Analysis of Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma Implicates ECT2 as a Candidate Therapeutic Target

by

Nardin Samuel

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Molecular Genetics

University of Toronto

© Copyright by Nardin Samuel 2012

Abstract

This study presents an integrated analysis of pancreatic ductal adenocarcinoma

(PDAC) for identification of putative cancer driver genes in somatic copy number gains

(SCNGs). SCNG data on 60 PDAC genomes was extracted to identify 756 genes, mapping to

20 genomic loci that are recurrently gained. Through copy number and gene expression

analysis on a panel of 29 human pancreatic cancer cell lines, this gene catalogue was

refined to 34 PDAC high-confidence candidate genes. The performance of these genes was

assessed in pooled shRNA screens and only ECT2 showed significant essentiality to cell

viability in specific PDAC cell lines with genomic gains at the 3q26.3 locus that harbors this

gene. Targeted shRNA-mediated interference of ECT2 and pharmacological inhibition both

support the pooled shRNA screen findings. These results favor ECT2 as

a candidate target gene for further evaluation in the subset of PDACs presenting with 3q26

somatic copy number gains.

Acknowledgements

First I would like to acknowledge my supervisor and mentor, Dr. Thomas Hudson, for giving

me the opportunity to work in his lab and for his immense support, guidance and encouragement. I

also thank Dr. Jason Moffat for all of his support and for welcoming me into his lab to learn new

techniques and think critically about my work, as well as Azin Sayad and Dr. Kevin Brown from the

Moffat Lab, for their willingness to help with this project. I also thank Dr. Fei-Fei Liu and Dr. Brenda

Gallie for kindly mentoring me and serving on my supervisory committee.

I would also like to thank the entire Hudson lab, especially Mathieu Lemire, for all of his

help and support with statistical analyses. At OICR, I also thank Drs. David Uehling, Gennadiy Poda,

Rima Al-Awar, Quang Trinh and Lakshmi Muthuswamy for helpful discussions and feedback. I am

also very grateful to Dr. Troy Ketela, Kajaal Nagar, Sonali Weerawardane and Jasmyne Carnevale for

support with shRNA studies and for accommodating me in the lab. Lastly, I am so grateful for the

endless support of my wonderful family and friends.

During this work, a championed scientist, Dr. Ralph Steinman, was awarded one of the most

prestigious awards in research, a Nobel Prize in Medicine, but passed away from pancreatic cancer

before he could be presented with the award. For pancreatic cancer in particular, therapeutic

options are limited and it is these types of stories that remind me of the potential impact research

can achieve and inspire me to be involved in cancer research.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Abbreviations

Chapter 1
1 Introduction
    1.1 Pancreatic Ductal Adenocarcinoma
        1.1.1 Incidence and Mortality
        1.1.2 Molecular Biology of Pancreatic Ductal Adenocarcinoma
    1.2 Current Therapeutic Options for Pancreatic Ductal Adenocarcinoma
        1.2.1 Rationale for Identifying Novel Molecular Targets
    1.3 Somatic Mutations in Pancreatic Ductal Adenocarcinoma
        1.3.1 Driver vs. Passenger Mutations
        1.3.2 Known Driver Mutations in Pancreatic Ductal Adenocarcinoma
    1.4 Somatic Copy Number Gains in Human Cancer
        1.4.1 Methods for Genome-Wide Detection of Somatic Copy Number Gains
        1.4.2 Studies of Structural Mutations in Pancreatic Ductal Adenocarcinoma
    1.5 Features of Ideal Therapeutic Targets
    1.6 Epithelial cell-transforming oncogene 2 (ECT2)
        1.6.1 ECT2 Structure and Function
        1.6.2 ECT2 Copy Number Gains and Over-Expression in Human Cancer

Chapter 2
2 Identification of ECT2 as a Candidate Therapeutic Target Gene in Pancreatic Ductal Adenocarcinoma
    2.1 Introduction
    2.2 Hypothesis
    2.3 Project Aims
        2.3.1 Identification of Coding Regions of Recurrent Copy Number Gain in Human Pancreatic Ductal Adenocarcinoma
        2.3.2 Analysis of Candidate Gene List in an Independent Cohort of Human Pancreatic Ductal Adenocarcinoma Cell Lines
        2.3.3 Assembling a Catalogue of Candidate Genes for Further Study
        2.3.4 Modulation of Candidate Target Gene by shRNA-Mediated Interference and Pharmacological Approaches
    2.4 Materials and Methods
        2.4.1 Publicly Available Pancreatic Ductal Adenocarcinoma Genome Datasets
        2.4.2 Integrated Analysis of Pancreatic Cancer Genome Datasets
        2.4.3 Copy Number Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines
        2.4.4 Gene Expression Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines
        2.4.5 Integrated Analysis of Copy Number and Gene Expression of Candidate Genes to Refine List of Putative Target Genes
        2.4.6 Assembly of Pancreatic Ductal Adenocarcinoma Candidate Target Gene Database
        2.4.7 Compilation of ‘Druggable Genome’ Database
        2.4.8 Integration of RNA-interference Pooled Screen Studies to Identify Candidate Target Gene for Laboratory-Based Study
        2.4.9 Tissue Culture and Cell Lines
        2.4.10 ECT2 and Control Lentivirus Production
        2.4.11 Lentivirus Titration
        2.4.12 Cell Viability Assay in shRNA Experiment
        2.4.13 Pharmacologic Modulation Assay
    2.5 Results
        2.5.1 Genomic Regions of Recurrent Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma
        2.5.2 Integrated Copy Number and Expression Analysis of Candidate Genes
        2.5.3 Database of Top-Ranked Candidate Target Genes
        2.5.4 Identification of ECT2 for Laboratory Study Through Integration of shRNA Pooled Screen Results
        2.5.5 Targeted shRNA studies of ECT2 in Pancreatic Ductal Adenocarcinoma Cell Lines
        2.5.6 Functional Effects of Pharmacologic Inhibition of the ECT2 Pathway on Cell Viability

Chapter 3
3 Discussion
    3.1 Pooling Data from Genome-Wide Analyses
    3.2 Analysis of Top-Ranked Candidate Genes and Identification of ECT2 as a Putative Target
    3.3 Dependence on ECT2 for Cell Viability in Cell Lines Bearing a Genomic Gain at the 3q26 Locus
    3.4 Differential Sensitivity to Inhibitors of ECT2-Mediated Cellular Pathway in Cell Lines Bearing Genomic Copy Number Gains at the 3q26 Locus
    3.5 Future Directions
        3.5.1 Rationale
        3.5.2 Specific Aims

References
Appendices

List of Tables

Table 1 Genomic loci encompassed in SCNGs identified in this study
Table 2 Regions of genomic gain identified in this analysis of pancreatic tumors as well as a survey of 26 histological subtypes in human cancer by Beroukhim et al., 2011
Table 3 Database of top-ranked candidate PDAC genes
Table 4 Results of copy number measures for ECT2 in cell lines utilized for targeted shRNA analyses obtained through different computational methods
Table 5 Copy number analysis of pancreatic cancer cell lines in Barretina et al., 2012
Table 6 Comparison of targeted shRNA analysis with results from pooled shRNA screen

List of Figures

Figure 1 ECT2 protein structure
Figure 2 ECT2 is mislocalized to the cytoplasm of primary non-small cell lung cancer tumors
Figure 3 Number of genes encompassed in genomic gains across multiple datasets
Figure 4 Number of genomic loci gained when assessing the datasets exclusive of the OICR dataset (AJH) and inclusive of the OICR dataset (AJHO)
Figure 5 Bioinformatic approach to identifying genes for further analysis
Figure 6 Circos plot depicting common regions of genomic gains
Figure 7 Comparison of 20 loci identified in this study with other pancreatic copy number studies in the literature
Figure 8 Peak regions of genomic amplification identified in a survey of 3,131 tumor specimens belonging broadly to 26 histological subtypes
Figure 9 Mean probe intensity for assigning continuous copy number measure
Figure 10 Association between Sum of Ranks and Spearman Rank Correlation Coefficient for PDAC genes
Figure 11 Representative copy number and gene expression correlation plots
Figure 12 Distribution of correlation coefficients in the top 5% most highly correlated genes in comparison to multiple simulations of random sets of genes
Figure 13 shRNA pooled screen results for top-ranked candidate genes
Figure 14a Comparison of essentiality scores of ECT2 in PDAC cell lines with copy number gains and cell lines in which ECT2 is diploid
Figure 14b Comparison of essentiality scores of PDAC essential genes with copy number gain
Figure 15 shRNA pooled screen results for ECT2
Figure 16 Comparison of Mean Probe Intensity (MPI) copy number estimation approach with Circular Binary Segmentation (CBS) for copy number estimation of ECT2
Figure 17a-e Copy number plots for ECT2
Figure 18a-j Targeted shRNA-mediated interference of ECT2 in PDAC cell lines
Figure 19 Targeted shRNA experiment results
Figure 20 Comparison of targeted shRNA-mediated ECT2 interference with shRNA pooled screen results
Figure 21 Pharmacological modulation of ECT2-mediated oncogenesis
Figure 22 Treatment of PDAC cell lines with PLK1 inhibitors

List of Appendices

Table A1 Focal somatic copy number gains in pancreatic ductal adenocarcinoma in the literature
Table A2 Public pancreatic cancer genome datasets utilized in copy number gain analysis
Table A3 Cell Lines utilized in integrated analysis in this study
Table A4 The RNAi Consortium (TRC) shRNA Constructs
Table A5 Puromycin concentrations used in shRNA experiments
Table A6 Details of PLK1 compounds utilized in pharmacologic assay

List of Abbreviations

aCGH      Array comparative genomic hybridization
AFG3L2    AFG3 ATPase family gene 3-like 2
ATCC      American Type Culture Collection
BAC       Bacterial artificial chromosome
bps       Base pairs
BRAF      v-raf murine sarcoma viral oncogene homolog B1
BRCT      BRCA1 C-terminal domain
CBS       Circular binary segmentation
DH        Dbl homology domain
DMEM      Dulbecco’s Modified Eagle’s Medium
ECT2      Epithelial Cell-Transforming Oncogene 2
FAK       Focal adhesion kinase
FISH      Fluorescence in situ hybridization
FBS       Fetal bovine serum
GARP      Gene Activity Rank Profile
GBM       Glioblastoma multiforme
GDP       Guanosine diphosphate
GEF       Guanine nucleotide exchange factor
GFP       Green fluorescent protein
GTP       Guanosine triphosphate
GTPase    Guanosine triphosphate hydrolase
HEK293T   Human embryonic kidney 293 SV40 large T-antigen
HER2      Human Epidermal Growth Factor Receptor 2
IMDM      Iscove’s Modified Dulbecco’s Medium
JHSF      Japan Health Sciences Foundation
kb        Kilobase pairs
KRAS      V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog
LACZ      Gene Z of lac operon
LSCC      Lung squamous cell carcinoma
LUC       Luciferase
Mb        Megabase
MITF      Microphthalmia-associated transcription factor
MELK      Maternal embryonic leucine zipper kinase
MPI       Mean Probe Intensity
mRNA      Messenger ribonucleic acid
MYC       v-myc myelocytomatosis viral oncogene homolog
NKX2-1    NK2 homeobox 1
NLS       Nuclear localization sequence
nM        Nanomolar
nm        Nanometer
NSCLC     Non-small cell lung cancer
p53       Tumor protein 53
PALB2     Partner and localizer of BRCA2
PBS       Phosphate-buffered saline
PCR       Polymerase chain reaction
PDAC      Pancreatic Ductal Adenocarcinoma
PH        Pleckstrin-homology domain
PLK1      Polo-like kinase 1
PSMD1     Proteasome 26S subunit, non-ATPase, 1
RALY      RNA binding protein, autoantigenic (hnRNP-associated with lethal yellow homolog)
RhoGEF    Rho Guanine nucleotide exchange factor
RNAi      RNA interference
RPMI      Roswell Park Memorial Institute Medium
RPS15     Ribosomal protein S15
RT        Reverse transcriptase
SCNA      Somatic Copy Number Alteration
SCNG      Somatic Copy Number Gain
shARP     shRNA Activity Rank Profile
shRNA     Short hairpin RNA
SNRPD1    Small nuclear ribonucleoprotein D1
UCSC      University of California Santa Cruz
VCP       Valosin-containing protein
VST       Variance-Stabilized Transformation
WNK1      WNK lysine deficient protein kinase 1
WST1      Water soluble disulfonated tetrazolium (4-[3-(2-methoxy-4-nitrophenyl)-2-(4-nitrophenyl)-2H-5-tetrazolio]-1,3-benzene disulfonate)
XRCC1     X-ray repair complementing defective repair in Chinese hamster cells 1 domain
z-GARP    z-Normalized Gene Activity Rank Profile

Chapter 1

1 Introduction

1.1 Pancreatic Ductal Adenocarcinoma

1.1.1 Incidence and Mortality

Pancreatic ductal adenocarcinoma (PDAC) is the fourth leading cause of cancer-related

mortality in North America (Jemal et al., 2010). Patients with PDAC present with the most dismal

prognosis of all solid tumors and this fact has remained unchanged over the past 50 years despite

advances in the molecular understanding of pancreatic cancers (Jemal et al., 2010; Yeo et al., 2002).

Among those patients who are diagnosed with pancreatic cancer, only 20% present with a tumor in

situ and are thus candidates for surgical resection with curative intent (Li et al., 2004). The survival

rate is only 2% for the majority of patients who present with metastatic disease, indicating that the

incidence of this malignancy approximates its mortality (Jemal et al., 2010).

1.1.2 Molecular Biology of Pancreatic Ductal Adenocarcinoma

Histologically differentiated pancreatic ductal adenocarcinomas (PDACs) comprise >90% of

exocrine pancreatic malignancies (Kloppel, 1998). Other pancreatic neoplasms such as

undifferentiated tumors, acinar cell carcinomas and cystadenomas, along with endocrine pancreatic

tumors are rare; hence, these tumor types are not within the scope of this thesis. Hereafter, use of

the term ‘pancreatic cancer’ will refer exclusively to PDACs.

The origin of pancreatic tumors has been the subject of much debate, as identification of the

pancreatic cells that are transformed into malignant lesions has been challenging. In an attempt to

characterize the origins of pancreatic tumors, a progression model of neoplasia has been posited

whereby precursor lesions give rise to malignant tumors. Precursor lesions include pancreatic

intraepithelial neoplasms (PanINs), intraductal papillary mucinous neoplasms (IPMNs) and

mucinous cystic neoplasms. Among these lesions, PanINs are the best characterized both

genetically and histologically. The first and most widely accepted progression model proposes that

PanINs arise from normal pancreatic ductal epithelium and progressively give rise to carcinoma in

situ, then invasive pancreatic cancer (Hruban et al., 2000).

Only a small subset of precursor lesions tends to be at risk of progressing to an invasive

phenotype. The underlying genetic characteristics of tumors might ultimately aid in the

identification of precursor lesions, potentially through the discovery of genetic mutations that are

common to both PanINs and the primary tumor. At the transcriptional level, high-grade PanINs

have demonstrated differential mRNA expression similar to that of PDACs, in comparison with

normal pancreatic ductal epithelium and acinar cells (Buchholz et al., 2005). Changes in gene

expression suggest that stage 2 PanINs might represent the earliest preneoplastic lesions and stage

3 PanIN lesions have a transcriptional profile nearly identical to that of PDACs (Buchholz et al.,

2005).

1.2 Current Therapeutic Options for Pancreatic Ductal Adenocarcinoma

The current standard therapy for both locally advanced pancreatic cancer and metastatic

pancreatic cancer involves a chemotherapeutic regimen with gemcitabine, a cytotoxic nucleoside

analogue. However, the clinical benefit derived from systemic gemcitabine therapy is a meager

average survival of less than 6 months, which is the median survival without therapy (Burris et al.,

1997). Combinations of gemcitabine and other chemotherapeutic agents have consistently failed to

yield statistically or clinically meaningful improvements in survival (Moore et al., 2007). Only in the

past year has a new therapeutic regimen, FOLFIRINOX (a combination of fluorouracil, folinic acid,

irinotecan and oxaliplatin), contested the routine use of gemcitabine in PDAC. In the ACCORD 11

trial led by Conroy and colleagues, the median overall survival increased from 6.5 months with

standard gemcitabine therapy to 11.1 months with FOLFIRINOX in patients with advanced

metastatic disease (Conroy et al., 2011). This finding represents the highest response rate ever

reported in a phase III clinical trial in patients with pancreatic cancer. However, FOLFIRINOX is

highly toxic and the safety information regarding this regimen is lacking. Despite this potential new

shift in the management of pancreatic cancer, the median survival that can be achieved with

therapy remains short, and the standard remains to administer the same therapy to all patients,

irrespective of the underlying tumor genetics.

Treating patients with pancreatic cancer is a challenge for numerous reasons. The poor

prognosis of this disease and its resistance to chemotherapies can be partially attributed to

diminished drug delivery secondary to the dense stromal barriers that surround the tumor, which

decrease the amount of vasculature available for drug delivery to the tumor site (Olive et al.,

2009). Indeed, one of the most prominent features of pancreatic cancer is the extensive stromal

reaction, which comprises up to 90% of the tumor volume (Chu et al., 2007; Neesse et al., 2011).

One review highlights other potential factors that might underlie chemoresistance (Wang et al.,

2011), such as the transformation of epithelial cells to a mesenchymal phenotype (Wang et al.,

2009; Shah et al., 2007). In addition, studies suggest that pancreatic cancer stem cells might be

involved in resistance to chemotherapies; however, the mechanism by which they confer resistance

remains unclear (Hermann et al., 2007; Hong et al., 2009).

Our understanding of chemoresistance is complicated by the fact that marked molecular

genetic heterogeneity exists among primary tumor cells and among those cells that are capable of

producing metastases (Campbell et al., 2010). Indeed, the lack of success of current clinical

interventions for PDACs can be partly attributed to the heterogeneity of molecular abnormalities

among patients’ tumors, which leads to differing patient responses to standard cytotoxic therapies.

In addition, the various histological subtypes of PDACs are associated with distinctive genetic

features and also exhibit different prognoses (Hong et al., 2011a).

1.2.1 Rationale for Identifying Novel Molecular Targets

As previously mentioned, genomic studies of pancreatic cancer are uncovering substantial

heterogeneity in the genetic alterations of this disease (Li et al., 2004). Parallel studies also

demonstrate that this heterogeneity exists not only among different patients, but within different

tumors and metastatic lesions from the same patient (Campbell et al., 2010). This observation

underscores the need for a mutational-targeted approach to cancer therapeutics, as it is clear that

patients have differing dominant genetic alterations that distinguish their respective malignancies.

Since mutated cancer genes have an essential role in malignant transformation, they are excellent

targets to be exploited for drug therapy (Stratton, 2011).

One study demonstrated the success of applying principles of personalized medicine to

PDAC: global genomic analysis was used to profile a patient’s tumor, and a chemotherapeutic

agent was then administered on the basis of the tumor’s mutational characteristics. The patient

was treated with mitomycin C (a DNA-damaging agent) after DNA-damaging agents showed

substantial activity against a xenograft generated from the patient’s primary tumor. Exomic

sequencing revealed biallelic inactivation of the PALB2 gene (Villarroel et al., 2011). This

mutation provided the basis for the patient’s response to mitomycin C: PALB2 is involved in

DNA repair, and its loss confers a growth advantage on the tumor yet renders it susceptible to

agents that damage DNA, thereby exploiting a mechanism that enables the tumor to

thrive.

Taken together, these examples clearly demonstrate that the rational approach of

identifying mutations and gearing therapy towards genetic features holds promise for novel cancer

therapeutics.

1.3 Somatic Mutations in Pancreatic Ductal Adenocarcinoma

1.3.1 Driver vs. Passenger Mutations

Although genetic mutations that are inherited in the germline can lead to familial cancer

syndromes, accumulation of somatic mutations, in the presence or absence of inherited baseline

susceptibility, is thought to drive and propagate the neoplastic process (Harris and McCormick,

2010). Only 5% of patients diagnosed with pancreatic cancer have a familial form of the disease,

which underscores the importance of understanding the role of somatic changes in driving cellular

transformation and developing therapies accordingly (Lynch et al., 1996). Although a wealth of

information on somatic mutations in a range of cancer genomes has been generated as a result of

the latest advances in sequencing technologies, uncertainties remain in distinguishing the

molecular events that drive cancer progression and the so-called ‘passenger’ mutations that are

present in the tumor but do not drive tumorigenesis. By contrast, ‘driver’ mutations promote tumor

growth, are positively selected, and are central to cancer development.

Passenger mutations may be randomly distributed in the genome, whereas driver

mutations occur in a subset of genes in a non-random pattern. However, this notion might not be

valid in all instances. For example, passenger mutations might arise as a result of an increased

mutation rate at a given genomic locus and such localized, nonrandom mutations could be

erroneously identified as potential driver mutations. Nonetheless, large-scale sequencing of cancer

genomes is yielding highly comprehensive genomic datasets that have led to the identification of

driver mutations. These datasets have been used to successfully distinguish deletions that

encompass tumor suppressor genes, which are driver mutations, from passenger deletions at

fragile sites in the genome (Bignell et al., 2010). Fragile sites are particularly prone to breakage and

mutation, and nonrandom mutations occurring at such sites are most often passenger mutations.

This example demonstrates that downstream characterization of implicated

genes and genomic regions is essential for delineating those somatic mutations that are drivers, so

that they can be targeted therapeutically and/or used as clinical biomarkers.

1.3.2 Known Driver Mutations in Pancreatic Ductal Adenocarcinoma

A subset of genes that can drive cancer formation has now been well characterized in a

model of PDAC development. In this progression model, activating mutations in the KRAS oncogene

are nearly universal, occurring in almost 100% of PDAC tumors (Klimstra and Longnecker, 1994;

Rozenblum et al., 1997). Highly oncogenic single base-pair substitutions in codon 12 of KRAS can

drive pancreatic cancer and are indeed hallmark features of PDAC. This mutation is involved in cell

proliferation, inhibition of apoptosis, interference with cellular cohesion and enhanced metastatic

properties that might partly result from activation of FAK and decreased expression of E-cadherin

(Rachagani et al., 2011). However, mutations in KRAS and enhanced activation of the Ras–Raf–

MAPK signaling pathway (through which KRAS acts) are alone insufficient for malignant

transformation. Mutations in KRAS occur even in subsets of PanINs and IPMNs that do not progress to

invasive malignancies (Hruban et al., 2000). Activating mutations in other oncogenes such as BRAF,

AKT2 and MYB have also been reported in pancreatic cancer (Calhoun et al., 2003; Cheng et al.,

1996; Ruggeri et al., 1998; Wallrapp et al., 1997).

Somatic mutations in tumor suppressor genes such as CDKN2A (also known as p16), TP53

(also known as p53) and SMAD4 (also known as DPC4) have been identified as driver mutations

occurring at high frequency in PDACs (Caldas et al., 1994; Ruggeri et al., 1992; Hahn et al., 1996).

The tumor protein p53 has several essential roles in the progression of the cell cycle, apoptosis and

DNA repair (Vogelstein and Kinzler, 2004). Functional inactivation of p53 enables cell proliferation

to occur in spite of DNA damage and subsequently facilitates tumor progression (Vogelstein and

Kinzler, 2004). Cyclin-dependent kinase inhibitor 2A (CDKN2A, also known as p16) interacts with

cyclin-dependent protein kinases such as CDK4 and CDK6 to inhibit progression of the cell cycle at

the G1/S checkpoint (Maitra and Hruban, 2008). SMAD4 has an essential role in the transforming

growth factor β (TGF-β) cellular pathway. Inactivation or deletion of SMAD4 results in loss of

SMAD4-dependent TGF-β signaling, which leads to aberrant cellular proliferation (Hong et al.,

2011b).

It is important to note that while these driver mutations are characteristic of PDAC and

occur at high frequency in PDACs, they are all observed across many other cancer types, indicating

that some driver mutations may be universal in cancer, whereas other, less frequent, tumor-specific

alterations may also drive cancer progression.

1.4 Somatic Copy Number Gains in Human Cancer

Among the different types of somatic genetic alterations that contribute to cancer

development, somatic copy number alterations (SCNAs) are highly common in cancer (Baudis,

2007; NCI/NCBI, 2001). Study of genes that are potentially affected by SCNAs, such as copy number

gains and losses, can inform pancreatic cancer pathogenesis and may hold promise for prospective

drug development efforts aimed at improved clinical management of PDAC. In particular, gene

targets of somatic copy number gains (SCNGs) are of interest since their expression is likely to be

upregulated due to increased copy number, or gene dosage, and they may thus be amenable to selective targeting

through pharmacological or genetic modulation. Copy number amplification refers to high-level

SCNGs, with an increase of five or more copies of a DNA segment less than 20 Mb in length (Brodeur,

1998).
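
The amplification definition above can be expressed as a small classification rule. The following Python sketch is only illustrative: the `Segment` fields, the diploid baseline of two copies, the reading of "an increase of five or more copies" as five or more copies above that baseline, and the example coordinates are all assumptions made for this example rather than details taken from Brodeur (1998) or from the analyses in this thesis.

```python
from dataclasses import dataclass

# Thresholds taken from the definition in the text.
MIN_COPY_INCREASE = 5    # "an increase of five or more copies"
MAX_SEGMENT_MB = 20.0    # the segment must be shorter than 20 Mb


@dataclass(frozen=True)
class Segment:
    """A genomic segment with an estimated tumor copy number (hypothetical structure)."""
    chrom: str
    start_bp: int
    end_bp: int
    copy_number: float

    @property
    def length_mb(self) -> float:
        return (self.end_bp - self.start_bp) / 1_000_000


def classify_segment(seg: Segment, baseline: float = 2.0) -> str:
    """Return 'amplification', 'gain' or 'no gain' for one segment.

    Assumes a diploid baseline of two copies unless another value is given.
    """
    increase = seg.copy_number - baseline
    if increase <= 0:
        return "no gain"
    if increase >= MIN_COPY_INCREASE and seg.length_mb < MAX_SEGMENT_MB:
        return "amplification"
    return "gain"


# Made-up example segments; the coordinates are illustrative only.
focal = Segment("chr3", 170_000_000, 172_000_000, copy_number=8.0)
broad = Segment("chr3", 100_000_000, 150_000_000, copy_number=8.0)
low = Segment("chr3", 170_000_000, 172_000_000, copy_number=3.0)
```

Under these assumptions only a short, high-level segment such as `focal` is labeled an amplification; a long segment or a low-level increase is reported as a gain.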

Many examples in the literature demonstrate the success of studying SCNGs in identifying

cancer-driving genes and associated therapeutic approaches. One study that integrated genome-wide

maps of copy number alterations in melanoma with gene expression signatures identified the MITF

gene as a critical melanoma oncogene (Garraway et al., 2005). Similarly, a large-scale

study of copy number alterations in primary lung adenocarcinomas identified recurrent SCNGs in

known lung adenocarcinoma loci, but also identified a highly recurrent 14q13.3 amplification

encompassing NKX2-1 (Zender et al., 2006). Further downstream functional analyses substantiated

the finding that NKX2-1 is a proto-oncogene in lung adenocarcinoma. These studies, among others,

demonstrate the utility of studying SCNGs in human cancer for the identification of genes that are

critical to cancer development and progression.

1.4.1 Methods for Genome-Wide Detection of Somatic Copy Number Gains

Molecular detection of SCNGs has advanced significantly over the past decade to enable

identification of regions of genomic gain with higher resolution. Comparative genomic

hybridization (CGH) detects and maps DNA sequence SCNAs throughout the genome. Conventional

CGH techniques for identifying SCNAs, such as chromosomal CGH utilizing metaphase

chromosomes, permitted detection of SCNAs with limited resolution (10-20 Mb). Array-based CGH

(aCGH) enabled further localization of SCNAs at the cytoband level. In aCGH, which uses DNA microarrays,

relative copy number is measured in specific genomic regions represented by arrays of mapped

clones or oligonucleotides. Fluorescently labeled test and reference DNAs are hybridized to the

array and the resulting ratio of the fluorescence intensities at each locus approximates the ratio of

the copy numbers of the corresponding DNA sequences in the test and reference genomes

(Albertson, 2006). In contrast to chromosomal CGH, the resolution of DNA microarrays is determined by the

spacing of the array elements, and current SNP-based arrays permit enhanced resolution through

use of highly dense array elements (Pinkel and Albertson, 2005).
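
The relationship between fluorescence ratio and copy number described above can be illustrated numerically. The sketch below is a simplification under stated assumptions: the function names are hypothetical, the reference genome is assumed to carry two copies at the locus, and normalization, probe bias and normal-cell contamination of the test sample are all ignored.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """Log2 of the test/reference fluorescence ratio at one array element."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def estimated_copy_number(log2r: float, reference_copies: float = 2.0) -> float:
    """Convert a log2 ratio to a copy number estimate.

    Assumes the intensity ratio equals the copy number ratio exactly,
    which real array data only approximate.
    """
    return reference_copies * 2.0 ** log2r
```

Under these assumptions, a locus with 1.5 times the reference intensity corresponds to roughly three copies, and a log2 ratio of 1 corresponds to four.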

Fluorescence in situ hybridization (FISH) is a cytogenetic method to detect SCNGs,

particularly high-copy amplifications (≥10 copies). Nucleic acid probes labeled with a

fluorochrome-conjugated nucleotide can be detected by fluorescently-labeled molecules. The

labeled probe is then hybridized to tissue or metaphase chromosomes and the nucleic acid

sequence is visualized by fluorescence microscopy (Albertson, 2006). FISH has traditionally been

the method of choice for clinical diagnostics related to SCNA detection.

Bacterial artificial chromosome (BAC) end sequencing is another, less widely used method

to measure genomic aberrations. A BAC library is constructed from a test genome, BAC end

sequences are obtained, and the end-sequence pairs are mapped onto a reference genome. Copy

number is inferred from the density of BAC end sequences and from BAC end pairs which map

abnormally far apart on the reference genome sequence (Volik et al., 2003).

More recently, with advances in sequencing technologies and their mainstream use in

genomics research, high-throughput sequencing has been used as a tool not only to detect base-pair

level mutations, but also complex structural rearrangements and SCNAs. In comparison to DNA

microarrays, sequence read alignment has been shown to have comparable power to detect SCNAs

and has over twofold improved precision for localizing SCNA breakpoints to within 1kb (Chiang et

al., 2009).

1.4.2 Studies of Structural Mutations in Pancreatic Ductal Adenocarcinoma

Chromosomal instability, manifesting as SCNAs and structural genomic alterations, is highly

characteristic of PDAC (Campbell et al., 2010). Various techniques for whole-genome analysis have

led to the identification of many regions of genomic gain and loss in pancreatic cancer and these

SCNAs likely harbor genes that are involved in PDAC progression as a result of their high

recurrence and localization in known loci of cancer genes (for example, the 8q24.3 locus harboring

the MYC oncogene). Genes affected by SCNAs have indeed been shown to have a role in PDAC

progression at the cellular level in further functional examinations.

Various techniques for whole-genome analysis have led to the identification of many

regions of SCNG in pancreatic cancer (Appendix Table A1). One study of genome-wide copy number

analysis in PDAC combined array-based comparative genomic hybridization (CGH) results from 24

pancreatic cancer cell lines to home in on a recurrently amplified region, 7q21.3-22.1 (Suzuki et al.,

2008). This strategy enabled the identification of a candidate oncogene, SMURF1, through various

biological validation studies, including knockdown of this gene and immunohistochemical assays

(Suzuki et al., 2008). Another study, which used representational oligonucleotide microarray analysis

and subsequent biological validation methods such as quantitative PCR and fluorescence in situ

hybridization, reported copy number gain and overexpression of the transcription factor GATA6 in

pancreatic carcinoma (Fu et al., 2008). Various other studies have identified gains in genetic copy

number at the 18q11.2 locus, which contains GATA6 (Heidenblad et al., 2004; Holzmann et al.,

2004; Kitoh et al., 2005; Loukopoulos et al., 2007). A follow-up study characterized the putative

mechanism through which GATA6 contributes to PDAC tumorigenesis during progression of PanINs

through the Wnt signaling pathway (Zhong et al., 2011). These examples demonstrate how focused

analysis of genes harbored in SCNAs can lead to identification of genes that are involved in PDAC

development and might therefore serve as candidate biomarkers or therapeutic targets.

One study utilized next-generation sequencing to survey structural genetic changes in 13

pancreatic cancer samples and identified a novel structural alteration – fold-back inversions

(Campbell et al., 2010). Fold-back inversions are copy number alterations whereby a genomic

region is duplicated but the two copies re-join in an abnormal head-to-head position in the opposite

orientation to the amplification breakpoint. Due to the nature of the orientation, this type of

structural genomic aberration could have only been detected through anomalous mapping of

sequencing reads. This study demonstrated that fold-back inversions occur early during the

development of pancreatic cancer and frequently trigger amplification of genes that can drive

cancer progression.

1.5 Features of Ideal Therapeutic Targets

The need for identification of novel therapeutic targets for PDAC is clear and moreover,

putative cancer-driving genes may be harbored in SCNGs, indicating that therapies targeting genes

in SCNGs may attenuate cancer growth. Features of an ideal drug target include: 1) The target plays

an essential role in cancer genesis or maintenance of the cancer phenotype; 2) The target is

overexpressed in cancer cells, and this over-expression is associated with a biomarker; 3)

Inhibition of the target’s expression induces growth suppression and/or apoptosis in cancer cells;

4) The target is ‘druggable’, meaning that it is amenable to inhibition by a small molecule or specific

antibody; and 5) The target is not expressed, or expressed at very low levels, in normal cells and its

inhibition has minimal effect on normal cell growth and function (Sun, 2006).

Toward this end, the model of targeting identified genetic changes has proven effective in

various examples. One archetypal example is Herceptin™ (Trastuzumab), a therapeutic antibody

that targets the protein encoded by amplified HER2, which has greatly impacted the prognoses of

patients with breast cancer bearing this genomic feature (Esteva et al., 2010). In another example, a

single base-pair substitution (that results in the amino-acid substitution Val600Glu) in the BRAF

oncogene, which encodes a serine-threonine kinase, has successfully been the target of drug

development. Vemurafenib, a BRAF-targeted agent, was recently approved for use in patients with

metastatic melanoma by the Food and Drug Administration (FDA). Indeed, the rational approach of

identifying mutations and gearing therapy towards genetic features holds promise for enhanced

cancer therapeutics.

1.6 Epithelial cell-transforming oncogene 2 (ECT2)

1.6.1 ECT2 Structure and Function

The candidate therapeutic target identified in this study, Epithelial cell-transforming

sequence 2 oncogene (ECT2), is a member of the guanine nucleotide exchange factor (GEF) family,

whose members catalyze the exchange of GDP for GTP, thereby activating Rho guanosine triphosphatases

(GTPases) in signal transduction (Fields and Justilien, 2010). GTPases function as molecular

switches regulating numerous signaling pathways that are involved in actin cytoskeleton

remodeling, cell motility, cell adhesion, cell cycle progression and gene expression (Fields and

Justilien, 2010). Since GTPases cycle between an active state whereby GTP is bound, and a GDP-

bound inactive state, GEFs therefore regulate GTPase activity.

Mammalian ECT2 was first isolated as a proto-oncogene from a murine keratinocyte cDNA

expression library (Miki et al., 1993). This gene is highly evolutionarily conserved. The Drosophila

ortholog of ECT2, Pbl, was discovered prior to the identification of the mammalian gene, and was

found to function as a Rho-GEF required for cytokinesis (Schumacher et al., 2004). ECT2 orthologs

have also been identified in Caenorhabditis elegans (Let-21) and in Xenopus (XECT2), and share

similarities with human ECT2 throughout their coding sequence (Dechant and Glotzer, 2003;

Tatsumoto et al., 2003).

Human ECT2 consists of an 883 amino acid chain with several protein domains (Figure 1

[Fields and Justilien, 2010]). The N-terminal regulatory domain contains sequences that are highly

homologous to cell cycle control and repair proteins (Saito et al., 2003). The adjacent XRCC1

domain is homologous to the human XRCC1 protein involved in DNA repair and sister chromatid

exchange (Thompson et al., 1990). The Cyclin B6 domain shows sequence homology to Clb6, a yeast

protein involved in the G1-to-S phase cell-cycle progression. Adjacent are two repeating BRCT

(Breast Cancer Gene 1 Carboxyl-terminal) motifs. These motifs are highly conserved in DNA repair

proteins and proteins involved in cell-cycle progression (Bork et al., 1997; Callebaut and Mornon,

1997). The center of the protein contains a small central (S) domain harboring two nuclear

localization sequences (NLSs) which may be involved in ECT2 nuclear localization. The C-terminus

of ECT2 contains its catalytic component consisting of a Dbl-homology (DH) and a

pleckstrin-homology (PH) domain which are responsible for the function of ECT2 as a RhoGEF. The adjacent

C-terminal (C) region of ECT2 does not exhibit significant homology to any known protein domains.

Figure 1. ECT2 protein structure (Fields and Justilien, 2010; Permission to re-use obtained; License No: 2891430429871). N, Amino-terminal region; XRCC1, X-ray repair complementing defective repair in Chinese hamster cells 1 domain; Cyclin B6, cyclin B6-like domain; BRCT, BRCA1 C-terminal domain; S, small central region; NLS, nuclear localization sequence; DH, Dbl-homology domain; PH, pleckstrin-homology domain; C, Carboxyl-terminal region.

With regards to intracellular localization and expression, ECT2 expression is controlled

throughout mitosis. ECT2 remains confined to the nucleus during interphase, and translocates into

the cytoplasm following the disappearance of the nuclear envelope at the onset of mitosis. During

metaphase, ECT2 is localized to the mitotic spindles, the cleavage furrow during telophase, and the

mid-body as cytokinesis ceases (Tatsumoto et al., 1999). Analysis of mRNA expression patterns

reveals ECT2 is expressed in adult tissues such as kidney, liver, spleen, testis, lung, bladder, ovary

and the brain, as well as fetal tissues such as the liver, thymus, epithelial lining of the nasal cavity

and gut, tooth primordia, costal cartilage, heart, lung and pancreas (Miki et al., 1993; Saito et al.,

2003).

Several studies have demonstrated the essential role of ECT2 and its orthologs in

cytokinesis. In Drosophila, the ortholog of ECT2, Pbl, activates Rho1 to promote cytokinesis

(Prokopenko et al., 1999). Similarly, Let-21, the C. elegans ortholog of ECT2, is required for

formation of the cleavage furrow (Dechant and Glotzer, 2003). In addition, perturbation of

mammalian ECT2 results in failure of cytokinesis, as observed by the accumulation of

multinucleated cells (Kim et al., 2005; Liu et al., 2004; Tatsumoto et al., 1999). There is also strong

evidence in the literature suggesting ECT2 is involved in cell polarity and asymmetrical cell

division. Apical-basal polarity in epithelial cells is regulated by the Par complex which consists of

Par-6/Par-3 (partition-defective)/atypical protein kinase C and small GTPases. One study reported

that ECT2 is detectable at cell junctions where it directly interacts with Par6 and PRKCζ and

regulates the activity of the latter (Liu et al., 2004). ECT2 has been implicated in regulating the

RhoGTPase Cdc42 and attachment of spindle microtubules to kinetochores during metaphase

(Oceguera-Yanez et al., 2005). Taken together, it is clear that ECT2 functions as a GEF for

RhoGTPases.

1.6.2 ECT2 Copy Number Gains and Over-Expression in Human Cancer

ECT2 has been identified as an oncogene in human cancer and its role in cancer has been

linked to genomic amplification and upregulated expression. The first study identifying ECT2 as a

proto-oncogene demonstrated it was capable of transforming fibroblasts (Miki et al., 1993). Since

then, ECT2 has been reported to be over-expressed in several human tumors including brain

(Salhia et al., 2008; Sano et al., 2006), lung (Hirata et al., 2009; Justilien and Fields, 2009), bladder

(Saito et al., 2004), esophageal (Hirata et al., 2009), pancreatic (Zhang et al., 2008), and ovarian

cancer (Saito et al., 2004). ECT2 is over-expressed at both the mRNA and protein levels in non-small

cell lung cancer (NSCLC) cell lines and primary tumors and interestingly, immunohistochemical

analysis showed that ECT2 is localized in the nucleus in normal lung tissue, but also appears to

translocate to some extent in the cytoplasm in primary NSCLC, and an independent analysis

validated these findings by showing cytoplasmic ECT2 staining in approximately 84% of primary

NSCLC (Figure 2; [Justilien and Fields, 2009]). These findings were also reproduced in glioblastoma

multiforme (GBM), whereby ECT2 was found over-expressed and mislocalized to the cytoplasm in

comparison with normal brain tissue and low-grade astrocytomas (Salhia et al., 2008).

Figure 2. ECT2 is mislocalized to the cytoplasm of primary non-small cell lung cancer tumors (Fields and Justilien, 2010. Permission to re-use obtained; License No: 2891430429871). Immunohistochemical staining of ECT2 in normal human lung epithelium (left) and primary lung adenocarcinoma (right) reveals ECT2 is primarily localized to the nucleus in normal lung tissue but localizes to both the nucleus and cytoplasm of primary tumor cells.

The 3q26 locus which harbors ECT2 has been reported to be the target of frequent

chromosomal alterations in human cancer (Lin et al., 2006; Meyer et al., 2007). In lung squamous

cell carcinoma (LSCC), ECT2 mRNA expression correlates with ECT2 copy number gains, indicating

that ECT2 amplification may be driving its over-expression in LSCCs (Justilien and Fields, 2009). In

addition, an estimated 40% of esophageal squamous cell carcinoma (ESCC) tumors bear 3q26

amplifications (Yang et al., 2008; Yen et al., 2005). ECT2 was reported to be over-expressed in

ovarian tumors that harbor ECT2 copy number gains in comparison to normal ovary tissue

(Haverty et al., 2009). 3q26 amplifications encompassing ECT2 have also been observed in head

and neck squamous cell carcinoma as well as cervical squamous cell carcinoma (Heselmeyer et al.,

1997). Taken together, evidence in the literature strongly suggests that ECT2 over-expression

may be driven by tumor specific amplification of the 3q26 amplicon which harbors ECT2. However,

ECT2 amplification alone is likely not the only mechanism by which ECT2 tumor-specific over-

expression is observed.

While the exact mechanisms of ECT2-mediated oncogenesis remain unclear, studies in

NSCLC and GBM suggest that ECT2 is important for proliferation, migration and invasion. The

oncogenic role of ECT2 is distinct from its normal physiological role in cytokinesis, and in NSCLC,

the role of ECT2 in cellular transformation appears to be related to its cytoplasmic mislocalization.

Chapter 2

2 Identification of ECT2 as a Candidate Therapeutic Target Gene in Pancreatic Ductal Adenocarcinoma

N.B. Contributions: Microarray experiments for gene expression and copy number on 29 PDAC cell lines were performed at the University Health Network (UHN) Microarray Center through the laboratory of Dr. Jason Moffat. Azin Sayad assisted with processing copy number data through the GPHMM algorithm as well as representation of shRNA pooled screen data in Figure 15.

2.1 Introduction

Current therapeutic options for pancreatic ductal adenocarcinoma are limited and do not

confer any improvement in overall disease progression or survival for the majority of patients. The

goal of this project was to integrate genomic data from primary PDACs to identify genes that are

targets of recurrent genomic mutation, and subsequently study the oncogenic potential of their

mutation in PDAC tumorigenesis.

Among the myriad of genetic mutations that can occur in human cancer, structural genomic

mutations, resulting from chromosomal instability, are characteristic of PDAC (Campbell et al.,

2010). Such aberrations can manifest as various somatic chromosomal changes, including

translocations, inversions and somatic copy number alterations (SCNAs). In particular, gene targets

of somatic copy number gains (SCNGs) are of interest since their expression is likely to be

upregulated due to increased copy, or gene dosage, and may be thus amenable to selective targeting

through pharmacological modulation.

In order to identify genes that may be effective therapeutic targets, it is necessary to

differentiate genes that are ‘passengers’ which are encompassed in SCNGs but are not involved in

the neoplastic process, from genes in SCNGs that are critical to tumor initiation and/or progression,

or so-called ‘driver genes’. Driver genes are positively selected for and are essential to cancer

development. As such, identification of driver genes in PDAC, and selective targeting of such genes

in the subset of tumors in which they are drivers, represents a potentially effective approach to

development of targeted therapies for this disease.

Driver genes harbored in SCNGs are expected to exhibit upregulated expression in the

tumor cells harboring the genetic gain. Integration of genomic and transcriptional profiles of the

same specimens is therefore valuable for honing potential target genes in defined regions of SCNG.

In addition, blockade of driver genes that are essential to the neoplastic process should conceivably

result in attenuation of the cellular processes that promote cancer formation and growth.

Consequently, the integration of function as measured by RNA-interference analyses with genomic

and transcriptomic profiles provides a suitable avenue to identify putative driver genes that can be

further studied in laboratory-based analyses to assess therapeutic potential.

2.2 Hypothesis

The primary hypothesis is that genes which are recurrently genomically amplified by copy

number gains, are upregulated, and are found to be essential to cancer cell viability in laboratory-

based analysis may be suitable therapeutic targets for further study. The secondary hypothesis is

that the genomic copy number gain may serve as a useful biomarker to identify patients who would

likely benefit from therapy targeting tumor-specific genetic abnormalities.

2.3 Project Aims

2.3.1 Identification of Coding Regions of Recurrent Copy Number Gain in Human Pancreatic Ductal Adenocarcinoma

Analysis of copy number data from primary pancreatic tumors and cell lines obtained from

publicly available PDAC datasets was conducted in order to identify common regions of recurrent

copy number gain and genes mapping to these regions. This provided a repository of genes that can

be studied further as their genetic amplification has been observed in human pancreatic cancer.

2.3.2 Analysis of Candidate Gene List in an Independent Cohort of Human Pancreatic Ductal Adenocarcinoma Cell Lines

Integration of gene expression data with the genetic information obtained from copy

number data analysis is a valuable approach to identifying candidate genes for further study. Some

genes may be identified as genetically amplified in pancreatic tumors but may not be expressed in

the tissue. Conversely, promising genes to further examine are those that display increased

expression level in the context of copy number gain at their respective locus.

Since expression data on the same tissue samples used for the initial copy number analysis

from publicly available datasets was unavailable, an independent panel of PDAC cell lines was

used to quantify copy number and gene expression for all genes identified in the initial analysis.

This same panel of cell lines that are genetically and transcriptomically profiled can then serve as

tools for laboratory-based investigation of candidate genes. Furthermore, this same panel of PDAC

cell lines was utilized in a pooled shRNA functional genetic screen, and results of this screen were

subsequently used to corroborate findings for candidate genes for further study.

2.3.3 Assembling a Catalogue of Candidate Genes for Further Study

The gene set obtained from analysis of copy number data from primary PDACs and cell lines

may contain potential driver genes, numerous passenger genes co-occurring on an amplicon with a

driver, as well as other potential ‘false-positive’ genes.

Using stringent filtering parameters to increase the statistical likelihood of capturing true

driver genes and differentiating them from passengers, the list of candidate target genes was further

refined to a handful of genes that can serve as a catalogue of candidate genes for laboratory-based

study.

2.3.4 Modulation of a Candidate Target Gene by RNA-interference and Pharmacological Approaches

Laboratory-based targeted analyses are necessary to validate that a genetic copy number gain

of a gene, and its associated over-expression, in fact leads to increased cell viability in cell lines

harboring the gain in comparison to cell lines in which the gene is not genetically gained. This

was accomplished by targeted shRNA-mediated interference of the target gene as well as

pharmacological inhibition of the cellular pathway in which the gene is involved.

2.4 Materials and Methods

2.4.1 Publicly Available Pancreatic Ductal Adenocarcinoma Genome Datasets

Somatic copy number gain data from 60 PDAC genomes from four independent pancreatic

cancer genome datasets were used to identify genes that are commonly gained in primary PDACs.

Details of each dataset as well as copy number analysis methods employed in each individual

dataset are summarized (Appendix Table A2).

2.4.2 Integrated Analysis of Pancreatic Cancer Genome Datasets

Regions of genomic gain were extracted from the data on 60 PDAC genomes, as identified in

each of four publicly available PDAC copy number datasets in Table 1 (QCMG, OICR, JHU, Harada).

The QCMG, OICR and JHU PDAC SCNG data were downloaded from the International Cancer

Genome Consortium Data Portal (ICGC, 2010). The PDAC SCNG data from the Harada dataset was

extracted from the Supplemental Information of Harada et al. (2009). The

Python (v2.7) programming language was used to parse each individual file into a common data

structure, whereby all of the regions of genomic gain are projected onto a reference genome build

(GRCh37/hg19). The output consisted of the UCSC gene name and ID of all genes in which a coding

region was encompassed in gains in at least three of the four datasets.
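
The recurrence criterion described above can be illustrated with a minimal sketch. It is written in Python 3 rather than the v2.7 used in this study, and the gain regions, gene names and coordinates are hypothetical placeholders, not data from the four datasets. It is further assumed that any overlap between a gain region and a gene's coding interval counts as the gene being encompassed in that gain.

```python
from collections import defaultdict

# Hypothetical gain regions per dataset: (chromosome, start, end) on GRCh37/hg19.
gains = {
    "QCMG":   [("chr3", 170_000_000, 176_000_000)],
    "OICR":   [("chr3", 171_000_000, 175_000_000)],
    "JHU":    [("chr3", 172_000_000, 174_500_000)],
    "Harada": [("chr8", 128_000_000, 129_000_000)],
}

# Hypothetical genes: name -> (chromosome, coding start, coding end).
genes = {
    "GENE_A": ("chr3", 172_400_000, 172_500_000),
    "GENE_B": ("chr8", 128_700_000, 128_760_000),
}

def overlaps(region, gene):
    """True if the gain region and the gene interval share at least one base."""
    r_chrom, r_start, r_end = region
    g_chrom, g_start, g_end = gene
    return r_chrom == g_chrom and r_start <= g_end and g_start <= r_end

def recurrent_genes(gains, genes, min_datasets=3):
    """Genes hit by a gain in at least `min_datasets` of the datasets."""
    support = defaultdict(set)
    for dataset, regions in gains.items():
        for name, gene in genes.items():
            if any(overlaps(region, gene) for region in regions):
                support[name].add(dataset)
    return sorted(n for n, ds in support.items() if len(ds) >= min_datasets)

print(recurrent_genes(gains, genes))
```

In this toy input only GENE_A is supported by three datasets, so it is the only gene retained.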

In addition, because true SCNA breakpoints may not accurately be delineated through

array-based methods employed by the original studies, all genes that were within 1Mb of the

minimal common region of overlap of genomic gains across the datasets were also included in the

gene set (expanded gene catalogue).
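
The expansion step can be sketched in the same way. The sketch below assumes that the minimal common region is the intersection of the overlapping gain intervals reported by the datasets and that a gene is included if any part of it lies within 1Mb of that region; the function names and coordinates are hypothetical.

```python
def minimal_common_region(regions):
    """Intersection of same-chromosome intervals, or None if they do not all overlap."""
    chroms = {chrom for chrom, _, _ in regions}
    if len(chroms) != 1:
        return None
    start = max(s for _, s, _ in regions)
    end = min(e for _, _, e in regions)
    return (chroms.pop(), start, end) if start <= end else None

def genes_near(region, genes, flank=1_000_000):
    """Genes overlapping the region after extending it by `flank` bases per side."""
    chrom, start, end = region
    lo, hi = max(0, start - flank), end + flank
    return sorted(
        name for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and g_start <= hi and g_end >= lo
    )

# Hypothetical overlapping gains from three datasets and three nearby genes.
regions = [
    ("chr3", 170_000_000, 176_000_000),
    ("chr3", 171_000_000, 175_000_000),
    ("chr3", 172_000_000, 174_500_000),
]
nearby_genes = {
    "INSIDE": ("chr3", 173_000_000, 173_100_000),
    "FLANK":  ("chr3", 175_200_000, 175_300_000),
    "FAR":    ("chr3", 177_000_000, 177_100_000),
}

mcr = minimal_common_region(regions)
print(mcr, genes_near(mcr, nearby_genes))
```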

2.4.3 Copy Number Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines

A panel of 29 PDAC cell lines was utilized to evaluate copy number of the genes in the

expanded catalogue (Appendix Table A3). Copy number analysis was performed using the Illumina

OmniExpress SNP array (Illumina, San Diego, CA). The raw signal obtained from the array is a ratio

of the intensity generated from the PDAC cell line sample relative to a reference sample, Log-R

Ratio (LRR):

$$\mathrm{LRR} = \log_2\left(\frac{R_{\mathrm{observed}}}{R_{\mathrm{expected}}}\right)$$

Rexpected is calculated using a cluster file characterizing a reference set of samples, which are

identical for all arrays. The value of Rexpected is different for each SNP on the array. These analyses

were carried out at the same time to minimize batch effects such that LRRs can be compared across

the arrays.
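
As a small worked illustration of the LRR definition, the sketch below computes the ratio for a single probe. A base-2 logarithm is assumed, and the function name and intensity values are hypothetical.

```python
import math

def log_r_ratio(r_observed, r_expected):
    """Log R Ratio for one SNP probe; a base-2 logarithm is assumed here."""
    if r_observed <= 0 or r_expected <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(r_observed / r_expected)

# Equal intensities give 0; doubling the observed intensity gives 1; halving it gives -1.
print(log_r_ratio(1.0, 1.0), log_r_ratio(2.0, 1.0), log_r_ratio(0.5, 1.0))
```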

For all genes in the analysis, a gene-directed approach to LRR estimation was formulated,

whereby the mean probe intensity (MPI) across all array probes mapping to each of the genes is

used to assign a continuous measure of copy number for that gene in each cell line. To compute the

MPI value for each gene, the R statistical programming tool

(http://cran.r-project.org/bin/windows/base/) was used to extract intensity measures of all array probes

mapping to a gene in the expanded catalogue, from transcription start to transcription end, and

then calculate the MPI of all probes mapping to each gene. A minimum of 10 SNP probes was

required for this analysis, and for smaller genes, the 10 probes in closest proximity to the gene

were used. To assess the validity of this approach, the MPI analysis was compared to the continuous

copy number estimates computed through the Circular Binary Segmentation algorithm (CBS)

(Olshen et al., 2004), and the results were consistent with CBS calculations. This resulted in a

continuous MPI measure for each gene in the expanded catalogue in each of the 29 cell lines, which

represents a continuous quantitative measure of copy number for each gene.
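
The gene-directed MPI calculation was carried out in R in this study. The sketch below restates the same logic in Python purely for illustration. It assumes that probes are supplied as (position, LRR) pairs on the gene's chromosome and that, for genes covered by fewer than 10 probes, the 10 probes nearest to the gene (including any inside it) are averaged. The function name is hypothetical.

```python
from statistics import mean

def mean_probe_intensity(probes, gene_start, gene_end, min_probes=10):
    """Mean LRR over the probes mapping to a gene.

    probes: list of (position, lrr) pairs on the gene's chromosome.
    If fewer than `min_probes` probes fall inside the gene, the
    `min_probes` probes closest to the gene are used instead.
    """
    inside = [(pos, lrr) for pos, lrr in probes if gene_start <= pos <= gene_end]
    if len(inside) >= min_probes:
        chosen = inside
    else:
        def distance(probe):
            pos = probe[0]
            if pos < gene_start:
                return gene_start - pos
            if pos > gene_end:
                return pos - gene_end
            return 0  # probe lies inside the gene
        chosen = sorted(probes, key=distance)[:min_probes]
    return mean(lrr for _, lrr in chosen)

# Hypothetical probes every 100 bp; positions 500..1400 carry an LRR of 0.5.
probes = [(100 * i, 0.5 if 5 <= i <= 14 else 0.0) for i in range(20)]
print(mean_probe_intensity(probes, 500, 1400))  # 10 probes inside the gene
print(mean_probe_intensity(probes, 900, 1000))  # small gene: nearest 10 probes
```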

2.4.4 Gene Expression Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines

Gene expression analysis was performed using the Illumina HT-12v4 BeadChip expression array

(Illumina, San Diego, CA). The relative expression value for each gene in the expanded catalogue

was computed using the Bioconductor lumi package for R

(http://www.bioconductor.org/packages/2.0/bioc/html/lumi.html). The basic pipeline involved

background signal subtraction, quality control exploratory analysis, variance-stabilization using the

variance-stabilizing transformation (VST) algorithm (Lin et al., 2008), and cross-array normalization

using cyclic LOESS normalization. This algorithm performed pair-wise normalization for all

possible pairs of samples and yielded a continuous measure of gene expression for all

genes.

2.4.5 Integrated Analysis of Copy Number and Gene Expression of Candidate Genes to Refine List of Putative Target Genes

Using the MPI measures for copy number and VST measures for expression for each of the

genes in the expanded catalogue in each of the 29 PDAC cell lines, a Spearman’s rho correlation

coefficient, ρ, was computed and subsequently used to identify genes in which MPI and VST were

most highly correlated across all cell lines. Genes for which the VST value was ≤6.8 (median VST

measure for all genes across all cell lines) were excluded. Genes for which the VST value in the cell

line expressing the gene at the highest level was at least 2.5-fold greater than that in the cell line

expressing the gene at the lowest level were selected for analysis. This was done in order to enable

stratification of cell lines into categories of ‘high relative expression’ and ‘low relative expression’

for each gene, as it correlates with copy number.
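
The correlation and filtering steps can be sketched as follows. Two points are assumptions made only for this illustration and are not stated in the text: the expression cutoff is applied to each gene's median VST value across cell lines, and the fold difference is computed as 2 raised to the difference between the highest and lowest VST values (that is, VST is treated as an approximately log2 scale). All function names and input values are hypothetical.

```python
from statistics import median

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def passes_expression_filters(vst_values, cutoff=6.8, min_fold=2.5):
    """Assumed reading of the two expression filters (see the lead-in)."""
    if median(vst_values) <= cutoff:
        return False
    fold = 2 ** (max(vst_values) - min(vst_values))
    return fold >= min_fold

# Hypothetical MPI and VST values for one gene across five cell lines.
mpi = [-0.1, 0.0, 0.2, 0.4, 0.6]
vst = [6.5, 6.9, 7.4, 8.0, 8.8]
print(spearman_rho(mpi, vst), passes_expression_filters(vst))
```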

The remaining genes were then selected for inclusion in the candidate gene database if ρ ≥

0.65 which corresponds to a correlation coefficient p-value < 0.05 when compared to the

correlation coefficients in randomly simulated gene sets. Simulated gene sets were compiled to

simulate the selection criteria used in the candidate gene catalogue. Since the genes are clustered in

discrete loci, the simulated gene sets mimicked the same scenario and were generated as follows:

an anchor gene was first randomly selected, along with all genes within 500kb of the anchor gene.

This process continued until the number of genes in the simulated gene set was equal to the

number of genes in the candidate gene set. This simulation was performed in 1500 replicates.
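
The construction of one simulated gene set can be sketched as below. The gene names and positions are hypothetical, each gene is represented by a single coordinate, and the way simulated correlation coefficients are pooled into a null distribution for the empirical p-value is an assumption of this sketch rather than a description of the analysis performed.

```python
import random

def simulate_gene_set(gene_positions, target_size, window=500_000, rng=random):
    """Draw anchor genes at random, adding each anchor and all genes within
    `window` bases of it, until `target_size` distinct genes are collected.

    gene_positions: name -> (chromosome, position).
    """
    names = sorted(gene_positions)
    if target_size > len(names):
        raise ValueError("target_size exceeds the number of available genes")
    selected, seen = [], set()
    while len(selected) < target_size:
        anchor = rng.choice(names)
        a_chrom, a_pos = gene_positions[anchor]
        for name in names:
            chrom, pos = gene_positions[name]
            if chrom == a_chrom and abs(pos - a_pos) <= window:
                if name not in seen and len(selected) < target_size:
                    seen.add(name)
                    selected.append(name)
    return selected

def empirical_p_value(observed_rho, null_rhos):
    """Fraction of simulated coefficients at least as large as the observed one."""
    return sum(r >= observed_rho for r in null_rhos) / len(null_rhos)

# Hypothetical genome of ten genes spaced 400 kb apart on one chromosome.
gene_positions = {f"gene_{i}": ("chr1", i * 400_000) for i in range(10)}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
simulated = simulate_gene_set(gene_positions, target_size=5, rng=rng)
replicates = [simulate_gene_set(gene_positions, 5, rng=rng) for _ in range(1500)]
print(simulated, len(replicates))
```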

2.4.6 Assembly of Pancreatic Ductal Adenocarcinoma Candidate Target Gene Database

A working repository of the top candidate genes based on the aforementioned analyses and

filtering parameters was created (Table 3). Initial annotations including the gene cytoband and

correlation coefficient computed between copy number and expression in the panel of 29 PDAC cell

lines were included. In addition, a ‘DeltaExp’ column provides the log2 expression difference in the

cell line expressing the gene at the highest relative level and the cell line expressing the gene at the

lowest relative level. The ‘zGARP Score’ column lists the mean z-normalized gene activity rank

profile (GARP) of that gene in a pooled shRNA screen in 27 of the same PDAC cell lines used for

copy number and gene expression analysis. Annotations of ‘Druggability’ were added from the

Druggable Genome Database compiled as described in section 2.4.7 below. The ‘DrugBank’ column

provides information on small molecules which target that specific gene and are characterized in

DrugBank - a comprehensive repository of drug information (Knox et al., 2011). ‘NormalPancExp’

and ‘PancTumorExp’ columns indicate protein levels of each gene as documented by

immunohistochemical analysis in the Human Protein Atlas (Uhlen et al., 2010), and ‘Differential

Exp’ indicates if differential expression between normal pancreatic and pancreatic tumor tissue

was observed at the mRNA level (GeneCards, 2011). Finally, the column ‘Assays’ indicates any

potential assays, based on the predicted function of the gene, which can be performed to reliably

test the effects of perturbation of the gene.

2.4.7 Compilation of ‘Druggable Genome’ Database

In order to add annotations of druggability to the candidate gene database it was necessary

to assemble a comprehensive repository of the ‘druggable genome’. This is the subset of the human

genome that expresses proteins that can bind drug-like molecules. Three druggable genome

datasets were merged for the compilation of the druggable genome database utilized in this study

(Russ and Lampel, 2005; Sophic, 2012; Yildirim et al., 2007). A gene is termed ‘druggable’ if it meets

at least one of the following criteria: (1) the gene product is known to bind drug molecules; (2) the

gene product can theoretically bind drug molecules because it belongs to a family of gene products

known to bind drug molecules (e.g. tyrosine protein kinases); (3) the gene product can theoretically

bind drug molecules because it contains protein domains that can theoretically bind small

molecules. Using these criteria and extensive manual curation of the three existing druggable

genome datasets, a database of druggable genes was generated.

2.4.8 Integration of RNA-interference Pooled Screen Studies to Identify Candidate Target Gene for Laboratory-Based Study

Annotations of performance in an shRNA pooled screen on a panel of 28 PDAC lines by

Marcotte et al. (all of which were also analyzed in this study) were added to the candidate gene

database. shRNA-mediated RNA-interference enables genome-wide loss-of-function screens, and as

such, a lentiviral-based shRNA library was used to facilitate genome-wide screening of cultured

cancer cells in a pooled format (Marcotte et al., 2012). Briefly, cells were infected with the shRNA

library targeting ~16 000 genes in a panel of 72 breast, pancreatic and ovarian cancer cell lines.

Integrating these results with the genomic and transcriptomic data in this study facilitated

systematic identification of genes which are essential to cell viability in the context of copy number

gains and over-expression. A mean z-normalized Gene Activity Rank Profile (zGARP) score from the

pooled screen for each gene was used to annotate the essentiality of each gene in the respective cell

line.

To assign a discrete predicted copy number value for the top candidate genes to formally

stratify genes by copy number and relate this to shRNA pooled screen performance, the Global

Parameter Hidden Markov Model (GPHMM) method was employed, as described by Li et al. (2011).

Genes with copy number ≥4 were grouped as ‘copy number gain’ in the

respective cell line, while genes with 2 copies were labeled diploid. Student’s t-test was used to

compare RNAi pooled screen scores between GPHMM copy number groups when the number of representative

cell lines was greater than or equal to 10. The Wilcoxon rank-sum test was used when the number

of representative cell lines was less than 10 in both groups (diploid and copy number gain).
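
The stratification and test selection described above can be sketched as follows. The cell line names, copy number calls and scores are hypothetical. The handling of the case in which only one group contains fewer than 10 cell lines is not specified in the text, so the sketch assumes the t-test is used unless both groups fall below the threshold. The tests themselves could then be run with standard implementations such as `scipy.stats.ttest_ind` and `scipy.stats.ranksums`.

```python
def copy_number_group(copy_number):
    """Label a discrete copy number call; other values are left unassigned."""
    if copy_number >= 4:
        return "gain"
    if copy_number == 2:
        return "diploid"
    return None

def split_scores(copy_numbers, scores):
    """Group per-cell-line screen scores by the copy number label of the gene."""
    groups = {"diploid": [], "gain": []}
    for cell_line, cn in copy_numbers.items():
        group = copy_number_group(cn)
        if group is not None and cell_line in scores:
            groups[group].append(scores[cell_line])
    return groups

def choose_test(n_diploid, n_gain, threshold=10):
    """Rank-sum test when both groups are small, otherwise the t-test (assumed)."""
    if n_diploid < threshold and n_gain < threshold:
        return "wilcoxon_rank_sum"
    return "students_t"

# Hypothetical copy number calls and zGARP scores for one gene in five cell lines.
copy_numbers = {"line_1": 2, "line_2": 4, "line_3": 5, "line_4": 3, "line_5": 2}
zgarp = {"line_1": -0.2, "line_2": -2.1, "line_3": -1.8, "line_4": -0.5, "line_5": 0.1}

groups = split_scores(copy_numbers, zgarp)
print(groups, choose_test(len(groups["diploid"]), len(groups["gain"])))
```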

2.4.9 Tissue Culture and Cell Lines

Ten human pancreatic ductal adenocarcinoma cell lines were utilized for laboratory-based

studies. AsPc1, Capan-1, Capan-2, HPAF-II, MIA PaCa-2, Panc03.27, Panc04.03 and Panc08.13 were

purchased from the American Type Culture Collection (ATCC; Manassas, VA). The human pancreatic

ductal adenocarcinoma cell line KP4 was obtained from the Japan Health Sciences Foundation

(JHSF). The human pancreatic ductal adenocarcinoma cell line PATU8988S was generously

provided by Francisco Real (Madrid, Spain). HPAF-II, PATU8988S and MIA PaCa-2 were cultured

in Dulbecco’s Modified Eagle’s Medium (DMEM; Invitrogen, California), supplemented with 10%

fetal bovine serum (FBS; Hyclone, Utah) and 0.1mg/mL penicillin/streptomycin (Invitrogen,

California). Capan-1 was cultured in Iscove’s Modified Dulbecco’s Medium (IMDM; Invitrogen,

California), supplemented with 20% FBS and 0.1mg/mL penicillin/streptomycin. Capan-2 was

cultured in McCoy’s 5A Modified Medium (Invitrogen, California), supplemented with 10% FBS and

0.1mg/mL penicillin/streptomycin. Panc03.27 and Panc08.13 were cultured in Roswell Park

Memorial Institute (RPMI) 1640 Medium with 2mM L-glutamine, 4.5g/L glucose, 10mM HEPES and

1.0mM sodium pyruvate (Invitrogen, California), supplemented with 10 Units/mL Human Insulin

(Wisent, Quebec), 15% FBS and 0.1mg/mL penicillin/streptomycin. Panc04.03 was cultured in

RPMI 1640 with 2mM L-glutamine, 4.5g/L glucose, 10mM HEPES and 1.0mM sodium pyruvate,

supplemented with 20 Units/mL Human Insulin, 15% FBS and 0.1mg/mL penicillin/streptomycin.

AsPc1 was cultured in RPMI 1640 with 2mM L-glutamine, 4.5g/L glucose, 10mM HEPES and 1.0mM

sodium pyruvate, supplemented with 10% FBS and 0.1mg/mL penicillin/streptomycin. KP4 was

cultured in RPMI 1640, supplemented with 10% FBS and 0.1mg/mL penicillin/streptomycin. All

cell lines were cultured in a 5% CO2 humidified incubator at 37°C.

2.4.10 ECT2 and Control Lentivirus Production

Human embryonic kidney 293 SV40 large T-antigen (HEK293T) packaging cells were

cultured in DMEM, supplemented with 10% FBS and 0.1X penicillin/streptomycin for cell seeding.

For viral harvesting, high bovine serum albumin (BSA) 293T growth media was used (DMEM,

supplemented with 10% FBS, 1.1g/100mL BSA and 1X penicillin/streptomycin). Viral production

was carried out as outlined in the RNAi Consortium Lentiviral Production protocol. Briefly,

HEK293T cells were seeded at a density of 2.2 × 10^5 cells/mL in 6-well plates and incubated for 24

hours in 5% CO2 and 37°C. HEK293T packaging cells were then transfected with a mixture of 3

infection plasmids: packaging plasmid (pCMV-dR8.74psPAX2; 500ng/well), envelope plasmid (VSV-

G/pMD2.G; 50ng), and the hairpin-pLKO.1 vector containing the TRC library shRNA (500ng/well),

as well as OPTI-MEM serum-free media (Invitrogen, California), for a total volume of 30µL. All

shRNA constructs used are listed in Appendix Table A4.

The three-plasmid mix was then added to a solution of TransIT-LT1 transfection reagent

(Mirus Bio, Madison WI) and incubated for 30 minutes at room temperature. The transfection mix

was then added to the packaging cells and left to incubate at 5% CO2 and 37°C for 18 hours. Media

was then changed to high-BSA growth media for viral harvests and cells were incubated for 24

hours at 5% CO2 and 37°C.

At approximately 40 hours post-transfection, media containing lentivirus was harvested

and again replaced with high-BSA media for subsequent viral harvests. Harvesting was repeated

after 24 hours. Media containing virus was centrifuged at 1250 rpm for 5 minutes to pellet any

packaging cells collected during harvesting, and the supernatant containing the virus was collected

and stored at -80°C.

2.4.11 Lentivirus Titration

Two titering experiments were performed: in the first experiment, HEK293 cells were

seeded into 96-well plates at a density of 2 × 10^4 cells/mL. After a 24-hour incubation at 5% CO2 and

37°C, 8µg of polybrene (Millipore, Billerica MA) was added to the cells, and cells were transduced

with either 5µL of virus or 15µL of virus, in triplicate, in DMEM and 10% FBS. After 24 hours, media

was changed to DMEM and 10% FBS containing 1µg/mL of puromycin and incubated for 24 hours at

5% CO2 and 37°C. Media was then changed to regular media and cells were incubated at 5% CO2

and 37°C. After 72 hours, WST1 reagent (Roche, California) was added to cells and cells were

incubated for 45 minutes and 450nm absorbance was then measured using a UV/Vis

Spectrophotometer Plate Reader (Biotek, Winooski VT). Absorbance was averaged across triplicate

wells and was normalized to identical plates to which no puromycin was added.

The second titering experiment involved the same protocol as above, but instead of HEK293

cells, human pancreatic ductal adenocarcinoma cell lines KP4, Capan-1, Capan-2, AsPc1, Panc04.03,

Panc03.27 and PATU8988S were utilized and cultured in their respective media as outlined in

2.4.9. A virus mixture of equal amounts of 5 ECT2 shRNAs as well as a GFP shRNA was added in

2-fold serial dilutions (2µL-128µL).

2.4.12 Cell Viability Assay in shRNA Experiment

Human pancreatic ductal adenocarcinoma cell lines AsPc1, Capan-1, Capan-2, KP4, HPAF-II,

Panc03.27, Panc04.03, Panc08.13, MIA PaCa-2 and PATU8988S were seeded in duplicate 96-well

plates at a cell density of 2000 cells/well and left to incubate at 5% CO2 and 37°C for 24 hours. After

incubation, 2µL of 4µg/µL polybrene was added to all cells. For one plate of each cell line,

regular media was replaced with puromycin-containing media and incubated for 24 hours at 5%

CO2 and 37°C. Concentrations of puromycin used for each cell line are listed in Appendix Table A5.

After incubation, puromycin-containing media for each plate was then replaced with regular media

for each respective cell line and left to incubate for 72 hours at 5% CO2 and 37°C.

Cells were then rinsed twice with phosphate-buffered saline (PBS) and fixed with 4%

paraformaldehyde for 10 minutes. Following rinsing with PBS, cells were stained with Hoechst

(Invitrogen, California) and rinsed twice with PBS. Nuclei counts were obtained by analysis of

stained nuclei using the IN Cell Analyzer 2000 and analyzed on the IN Cell Developer Analyzer

Workstation 3.7 (GE Healthcare, Chalfont St Giles, United Kingdom). Nuclei counts were averaged

across triplicate wells and normalized to nuclei counts in the wells transduced with the shRNA-GFP

construct.

2.4.13 Pharmacologic Modulation Assay

Human pancreatic ductal adenocarcinoma cell lines AsPc1, Capan-1, Capan-2, KP4, HPAF-II,

Panc03.27, Panc04.03, Panc08.13, MIA PaCa-2 and PATU8988S were seeded in duplicate 96-well

plates at a cell density of 2000 cells/well and left to incubate at 5% CO2 and 37°C for 24 hours. After

incubation, cells were treated with either BI-6727 compound (Boehringer-Ingelheim, Ingelheim am

Rhein, Germany) or GSK461364 (GlaxoSmithKline, Brentford UK) in 3-fold serial dilutions (30µM

to 0.01µM) or 0.3% DMSO control and incubated at 5% CO2 and 37°C for 72 hours. Details of each

compound are provided in Appendix Table A6. Following incubation, cells were then treated with

WST1 reagent for 45 minutes and 450nm absorbance was determined. Media absorbance was

subtracted from all readings and absorbance measures from triplicate wells were averaged and

normalized to absorbance of the DMSO control wells.
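The dilution scheme and the absorbance normalization described above can be illustrated as follows. This is a hedged sketch, not the analysis script used in the study: the function names and the absorbance readings are hypothetical, and the choice of 8 dilution points is an assumption made because 8 three-fold steps from 30µM end at roughly 0.014µM, close to the stated lower bound of 0.01µM.

```python
from statistics import mean


def dilution_series(top_concentration, fold, n_points):
    """Concentrations of a serial dilution, highest first."""
    return [top_concentration / fold ** i for i in range(n_points)]


def relative_viability(treated_wells, control_wells, media_blank):
    """Blank-subtract, average replicate wells, and normalize to the control.

    Returns the mean treated absorbance as a fraction of the mean
    DMSO-control absorbance (1.0 means no loss of viability).
    """
    treated = mean(a - media_blank for a in treated_wells)
    control = mean(a - media_blank for a in control_wells)
    return treated / control


# Assumed 8-point, 3-fold series starting at 30 (units of µM).
concentrations = dilution_series(30.0, 3, 8)

# Hypothetical 450nm readings from triplicate wells.
viability = relative_viability(
    treated_wells=[0.62, 0.58, 0.60],
    control_wells=[1.08, 1.12, 1.10],
    media_blank=0.10,
)
```

Plotting `relative_viability` against each value in `concentrations` gives the dose-response curve for a compound in a given cell line.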

2.5 Results

2.5.1 Genes and Genomic Regions of Recurrent Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma

Each of the four publicly available pancreatic cancer genome datasets used in this study

analyzed a varying number of samples and employed a different platform and associated algorithm

for calling SCNAs (Appendix Table A2). A comprehensive review of copy number alteration

detection platforms underscores the inherent variability in detecting SCNAs using various

techniques (Pinto et al., 2011). Bearing this limitation in mind, I analyzed only SCNAs that were

consistently observed in multiple datasets, as this would decrease the likelihood that the observed

SCNAs were technical artifacts. As such, the number of genes encompassed in regions of genomic

gain across multiple datasets was determined (Figure 3). The number of genes encompassed in

gains in two or more datasets comprised a large proportion of the genome (4617 genes). Only one

gene, IFLTD1, was encompassed in gains in all four datasets. This gene resides on the same

chromosomal locus as KRAS in the 12p12.1 region, and its gain across all four datasets could reflect

biological importance of KRAS as well as other nearby genes, including IFLTD1. In order to compile

a manageable list of candidate target genes, I chose to focus on genes that were encompassed in

gains in three of the four datasets. This integrated analysis of copy number gains revealed 171

genes encompassed in gains in at least one sample in three of the four pancreatic cancer SCNG

datasets.
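The cross-dataset counting step can be sketched as below. The dataset labels, gene names and memberships are invented for illustration and do not reproduce the study's data; the threshold is read as "at least three of the four datasets", which is an interpretation of the text.

```python
from collections import Counter


def dataset_support(gained_genes_per_dataset):
    """Count, for every gene, how many datasets report it inside a gain."""
    support = Counter()
    for genes in gained_genes_per_dataset.values():
        support.update(set(genes))  # set() so a gene counts once per dataset
    return support


def genes_with_min_support(support, min_datasets):
    """Genes gained in at least min_datasets datasets, sorted by name."""
    return sorted(g for g, n in support.items() if n >= min_datasets)


# Toy input: four hypothetical datasets with hypothetical gene names.
gained = {
    "dataset_1": {"GENE_A", "GENE_B", "GENE_C"},
    "dataset_2": {"GENE_A", "GENE_B", "GENE_D"},
    "dataset_3": {"GENE_A", "GENE_B", "GENE_C"},
    "dataset_4": {"GENE_A", "GENE_D"},
}
support = dataset_support(gained)
in_two_or_more = genes_with_min_support(support, 2)
in_three_or_more = genes_with_min_support(support, 3)
in_all_four = genes_with_min_support(support, 4)
```

Raising the threshold from two to three datasets is what shrinks the list to a manageable size in the analysis described above.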

Figure 3. Number of genes encompassed in genomic gains among multiple datasets. Number of genes encompassed in genomic gains in 2/4 datasets (leftmost bar), 3/4 datasets (middle bar), and all datasets (rightmost bar) are depicted.

[Figure 3 bar chart. Vertical axis: Number of Genes in SCNGs; horizontal axis: Number of Datasets. Bar values: 4617, 171 and 1.]

To further validate the approach of selecting genes found in SCNGs in 3/4 datasets, the

impact of excluding one dataset was analyzed. On close inspection, two of the five samples in the

OICR dataset appeared to over-represent SCNGs, as >20% of the genome is called as gained in each

of these two samples. Rather than excluding this dataset outright, I assessed the extent to which it

is essential for the final analysis. Namely, the specific regions and the total number of regions

gained were assessed with both the inclusion and exclusion of the OICR dataset. The analysis

reveals that the loci identified as gained in 2 out of 3 datasets (i.e. excluding the OICR dataset)

included all of the same loci identified as gained in three out of four datasets (i.e. including the

OICR dataset). This indicates that exclusion of the OICR dataset does not impact the final

determination of loci to be studied further (Figure 4).

Figure 4. Number of genomic loci gained when assessing the datasets exclusive of the OICR dataset (AJH) and inclusive of the OICR dataset (AJHO). The leftmost bar indicates the number of genomic regions gained in all three of the AJH datasets, while the middle bar represents a less stringent criterion, namely the number of genomic regions gained in at least 2 of the AJH datasets. By definition, all loci found gained in 3 out of the AJHO datasets (right bar) would have been identified in at least two of the AJH datasets (middle bar).

The set of 171 genes encompassed in SCNGs in 3 out of the four datasets comprised the

‘core catalogue’ of genes that were putative targets of SCNGs. However, because true boundaries of

SCNGs cannot be definitively delineated using the methods employed in the four PDAC SCNG

datasets, we chose to expand the core PDAC catalogue of 171 genes to include all genes in the

vicinity of each respective locus, and included all genes within 1Mb of the minimal common region

of overlap of the SCNG in each independent dataset. This resulted in an ‘expanded catalogue’ of 756

PDAC genes which were putative SCNG targets for further analysis (Figure 5).
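The expansion from the core to the expanded catalogue amounts to padding each minimal common region by 1Mb on either side and collecting every gene that overlaps the padded interval. The sketch below illustrates this under stated assumptions: the gene names and coordinates are hypothetical, and the tuple layout of the annotation is chosen only for the example.

```python
FLANK_BP = 1_000_000  # 1Mb on each side, as described in the text


def expand_region(start, end, flank=FLANK_BP):
    """Pad a region on both sides, clamping at the chromosome start."""
    return max(0, start - flank), end + flank


def genes_in_region(genes, chrom, start, end):
    """Names of genes overlapping [start, end] on the given chromosome.

    genes is an iterable of (name, chrom, gene_start, gene_end) tuples.
    """
    return [
        name
        for name, g_chrom, g_start, g_end in genes
        if g_chrom == chrom and g_start <= end and g_end >= start
    ]


# Hypothetical annotation and minimal common region of overlap.
annotation = [
    ("GENE_A", "chr3", 5_000_000, 5_050_000),
    ("GENE_B", "chr3", 6_500_000, 6_600_000),
    ("GENE_C", "chr3", 9_000_000, 9_100_000),
    ("GENE_D", "chr7", 5_500_000, 5_600_000),
]
region = ("chr3", 5_000_000, 5_600_000)

core = genes_in_region(annotation, *region)
padded_start, padded_end = expand_region(region[1], region[2])
expanded = genes_in_region(annotation, region[0], padded_start, padded_end)
```

In this toy case the core list holds one gene and the expanded list holds two, mirroring how padding enlarges the catalogue.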

Figure 5. Bioinformatic approach to identifying genes for further analysis. Chromosomal positions of regions of genomic gain were extracted from all datasets and converted to the same human genome assembly (GRCh37/hg19). Genes residing in these regions were then identified and compiled into a non-redundant combined list of genes found in at least one region of genomic gain in at least one sample in one of the four datasets. Of these genes, those appearing in regions of genomic gain in at least three of the four datasets were identified (171 genes). [QCMG=Queensland Center for Medical Genomics; ICGC=International Cancer Genome Consortium; OICR=Ontario Institute for Cancer Research; JHU PCGP=Johns Hopkins University Pancreatic Cancer Genome Project.]

The catalogue of PDAC genes encompassed in SCNGs in three of the four PDAC datasets maps

to 20 discrete genomic loci that harbor a total of 756 genes (Table 1). The distribution of the

number of tumor samples harboring a genomic gain at each of the 20 genomic loci is depicted in

Figure 6. The genomic regions that appear to be most frequently gained are 20q11-20q12.31 (60.0%),

16p11-16p13.3 (58.3%), 12p11.21-12p13.33 (55.0%) and 14q11 (48.3%).

Table 1. Genomic loci encompassed in SCNGs identified in this study.

Locus              Size (Mb)   Frequency in 60 PDAC Genomes
2q14-2q14.3        8.01        29 (48.3%)
3q25-3q26.3**      7.10        24 (40.0%)
7p15.2             1.62        28 (46.7%)
7p22-7p22.3        2.02        25 (41.7%)
8q24.2**           3.68        23 (38.3%)
9p13.3             3.16        9 (15.0%)
10q22-10q22.4**    1.07        25 (41.7%)
12p11-12p13.3**    12.16       33 (55.0%)
14q11-14q13**      1.51        30 (50.0%)
15q12-15q15        2.18        28 (46.7%)
15q24.2            1.18        23 (38.3%)
15q26.3            2.83        25 (41.7%)
16p13.3            2.22        35 (58.3%)
16q22-16q22.3      0.92        23 (38.3%)
18p11-18p11.21     0.82        8 (13.3%)
18q11-18q11.2      1.95        22 (36.7%)
19p13.3            1.41        6 (10.0%)
20p13              1.06        7 (11.7%)
20q11-20q13.3**    7.43        36 (60.0%)
Xq12-13**          0.791       6 (10.0%)

Figure 6. Circos plot depicting common regions of genomic gains. The loci identified in somatic copy number gains in at least three out of the four PDAC datasets are depicted in the figure, mapped to the chromosomal region. The height of each bar represents the frequency of the genomic gain in the respective dataset. (OICR: Ontario Institute for Cancer Research; JHU: Johns Hopkins University; QCMG: Queensland Center for Medical Genomics).

The literature-curated data indicated that 17 of the 20 SCNG regions identified in this study have

been observed in previous PDAC copy number studies (Figure 7). Notably, for these loci, the

presumptive cancer-related driver genes within these SCNGs are yet to be identified.

[Figure 7 matrix. Rows: the literature references listed below, grouped by technology. Columns: the 20 loci identified in this study.]

Loci (columns): 2q14.3, 3q25-q26, 7p15.2, 7p22.3, 8q24.2, 9p13.3, 10q22.2, 12p11.2-13, 14q11.2, 15q13-14, 15q24, 15q26.3, 16p13.3, 16q22.1, 18p11.21, 18q11, 19p13.3, 20p13, 20q11-13, Xq12-13.

Chromosomal CGH: Solinas-Toldo et al, 1996; Mahlamaki et al, 1997; Fukushige et al, 1997; Curtis et al, 1998; Ghadimi et al, 1999; Schleger et al, 2000; Shirasi et al, 2001; Harada et al, 2002; Mahlamaki et al, 2002; Lin et al, 2003; Kitoh et al, 2005.

Array CGH (BAC/PAC, cDNA): Heidenblad et al, 2004; Aguirre et al, 2004; Holzmann et al, 2004; Mahlamaki et al, 2004; Bashyam et al, 2005; Gysin et al, 2005; Nowak et al, 2005; Loukopoulos et al, 2007.

Array CGH (SNP-arrays): Harada et al, 2008.

Figure 7. Comparison of 20 loci identified in this study with other pancreatic copy number studies in the literature. Literature references to PDAC copy number studies are listed in the ‘Reference’ column, and are grouped by the technology employed to call somatic copy number alterations (leftmost column). Of the 20 loci identified in our study, 17 have been identified in gains in at least one other PDAC study in the literature (depicted by green boxes).

Moreover, 7/20 of these genomic loci have been identified as frequent targets of genomic

gain or amplification in a survey of 3 131 tumor specimens belonging to 26 distinct tumor

histological subtypes (Figure 8; Table 2; [Beroukhim et al., 2010]).

Figure 8. Peak regions of genomic amplification identified in a survey of 3 131 tumor specimens belonging broadly to 26 histological subtypes (Beroukhim et al, 2010; Permission to reuse obtained; License No: 2891421438977). Chromosomal position is depicted on the vertical axis. The horizontal length of each peak indicates the statistical confidence that this peak is a true amplification peak.

Table 2. Regions of genomic gain identified in this analysis of pancreatic tumors as well as in a survey of 26 histological subtypes of human cancer by Beroukhim et al, 2010.

Locus

3q26

8q24

10q22

12p11.21-p13.33

14q11.2

20q11-q13

Xq12

2.5.2 Integrated Copy Number and Expression Analysis of Candidate Genes

To further refine the PDAC gene catalogue, copy number and gene expression level

measures of genes in the expanded catalogue of 756 genes were assessed in an independent panel

of 29 human PDAC cell lines (Appendix Table A3). Array-based copy number and gene expression

data on these 756 genes was utilized in the analysis and genes were ranked by the correlation

measure between copy number and expression.

In order to correlate copy number and expression for the gene set, it was necessary to

derive continuous measures of each of these attributes. To assign a continuous measure of copy

number for each gene, a gene-directed approach (as opposed to a genome-wide approach) was

used, whereby the array intensity measures for all SNP probes on the array mapping to a gene were

averaged to provide an approximate measure of array intensity, mean probe intensity (MPI) as it

relates to putative copy number (Figure 9).

Figure 9. Mean probe intensity for assigning continuous copy number measure. SNP probe intensities for all probes mapping to each gene are averaged to assign a mean probe intensity continuous measure of copy number.
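The gene-directed mean probe intensity (MPI) measure can be sketched as a simple group-and-average operation. The probe identifiers, gene names and intensity values below are hypothetical, and the dictionary-based probe-to-gene map is an assumption made for the example.

```python
from collections import defaultdict
from statistics import mean


def mean_probe_intensity(probe_intensity, probe_to_gene):
    """Average the intensities of all probes that map to each gene.

    Probes with no gene assignment are ignored.
    """
    per_gene = defaultdict(list)
    for probe, intensity in probe_intensity.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            per_gene[gene].append(intensity)
    return {gene: mean(values) for gene, values in per_gene.items()}


# Hypothetical SNP probe intensities for one cell line.
probe_intensity = {"probe_1": 2.0, "probe_2": 4.0, "probe_3": 3.0, "probe_4": 9.0}
probe_to_gene = {"probe_1": "GENE_A", "probe_2": "GENE_A", "probe_3": "GENE_B"}

mpi = mean_probe_intensity(probe_intensity, probe_to_gene)
```

Repeating this for each cell line yields one continuous copy number measure per gene per cell line.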

For each gene, the correlation between copy number and gene expression for that gene was

assessed by computing a Spearman-rho correlation coefficient, ρ, of these measures across the 29

PDAC cell lines. In addition, a Sum of Ranks was computed, which is a measure of the relation of the

top five cell lines with the highest MPI copy number measure for that gene and the bottom five cell

lines with the lowest MPI copy number measure. The lower the sum of ranks, the better the

correlation between copy number and gene expression in the 5 cell lines with the highest copy

number and the 5 cell lines with the lowest copy number. The sum of ranks was highly associated

with the Spearman-rho correlation coefficient of copy number and expression for each gene (Figure

10). Representative gene plots for copy number and gene expression correlations are shown in

Figure 11.
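For reference, the Spearman-rho coefficient used here is the Pearson correlation of the rank-transformed values. The following is a self-contained sketch using average ranks for ties; the input values are invented, and the Sum of Ranks measure is not reproduced because its exact definition is not given in enough detail to implement faithfully.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical MPI copy number and expression values across five cell lines.
copy_number = [1.0, 2.0, 3.0, 4.0, 5.0]
expression = [1.0, 4.0, 9.0, 16.0, 25.0]
rho = spearman_rho(copy_number, expression)
```

Because the measure depends only on ranks, any monotonic relationship between copy number and expression gives a coefficient of 1, as in this toy example.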

Figure 10. Association between Sum of Ranks and Spearman-rho Correlation Coefficient for PDAC genes. Sum of ranks measures are highly associated with the Spearman-rho Correlation Coefficients of copy number and expression for each gene in PDAC gene catalogue.

Figure 11. Representative copy number and gene expression correlation plots. Representative copy number and gene expression plots for four genes from the 756 gene set are shown. KRAS is a known oncogene in PDAC, MYC is a known oncogene in other human cancers and the other genes depicted, ECT2 and VCP, have not been characterized in PDAC. Red data points indicate the 5 cell lines with highest relative copy number and 5 cell lines with lowest relative copy number.

Among the genes in the set of 756 genes, those in the 95th percentile of correlation

coefficients have higher correlations between copy number and gene expression in comparison to

randomly simulated gene sets (p-value=0.007; Figure 12). Genes were selected for further

investigation if they were found to exhibit a correlation coefficient p-value < 0.05 when compared

with the correlation coefficients in randomly simulated gene sets, a 2.5-fold or greater difference in

expression between the cell lines expressing the gene at the highest and lowest levels and a

minimum expression measure of 6.8 in the cell line expressing the gene at the lowest level. The

minimum expression level was based on the median expression measure for all genes across all cell

lines and represents the lower limit of transcript detection.
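The three selection criteria can be combined into a single filter, sketched below. The sketch assumes that expression measures are on a log2 scale, so that a 2.5-fold difference corresponds to a difference of log2(2.5), about 1.32, between the highest- and lowest-expressing cell lines; the function name and example values are hypothetical.

```python
import math

ALPHA = 0.05          # significance threshold on the correlation p-value
MIN_FOLD = 2.5        # minimum fold difference, highest vs lowest cell line
MIN_EXPRESSION = 6.8  # minimum expression in the lowest-expressing cell line


def passes_filters(p_value, expression_by_line):
    """Apply the three selection criteria to one gene.

    Assumes expression values are log2-scaled, so fold change is a
    difference of logs.
    """
    lowest = min(expression_by_line.values())
    highest = max(expression_by_line.values())
    delta_log2 = highest - lowest
    return (
        p_value < ALPHA
        and delta_log2 >= math.log2(MIN_FOLD)
        and lowest >= MIN_EXPRESSION
    )


# Hypothetical genes: (p-value, expression per cell line).
kept = passes_filters(0.01, {"line_1": 7.0, "line_2": 9.0})
too_flat = passes_filters(0.01, {"line_1": 7.0, "line_2": 8.0})
too_low = passes_filters(0.01, {"line_1": 6.0, "line_2": 9.0})
not_significant = passes_filters(0.20, {"line_1": 7.0, "line_2": 9.0})
```

A gene is retained only when all three conditions hold, which is why each of the last three toy cases is rejected.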

Figure 12. Distribution of correlation coefficients in the top 5% most highly correlated genes in comparison to multiple simulations of random sets of genes. The top 5% genes in the gene catalogue having the highest correlation (ρ) values were compared with the top 5% genes with the highest ρ values in sets of randomly selected genes. The simulated gene set was compiled to simulate the selection criteria used in our candidate gene catalogue. Since our genes are clustered in discrete loci, our simulated gene set mimicked the same scenario: an anchor gene was first randomly selected, along with all genes within 1Mb of the anchor gene. This simulation was performed in 1500 replicates. The median ρ for the top 5% genes in the simulation set was 0.68 while the median ρ in the top 5% genes in our catalogue was 0.73 (p=0.007).

2.5.3 Database of Top-Ranked Candidate Target Genes

The filtering parameters in 2.5.2 resulted in a refined list of 34 candidate genes (CAN-

GENES) for further study, from the original list of 756 genes. These top-ranked genes primarily

mapped to 3q26.1-q26.3, 7p22, 9p13.1-p13.3, 12p11.21-pter, 14q11.2, 15q14-15, 15q26, 16p13,

18p11, 18q11.2, 19p13.3, 20p13, 20q11.2-q13. Of these genes, 26% (9/34) mapped to the 12p11-

12 region. Further annotations to these CAN-GENES were added to formulate a working database of

candidate targets. Annotations include druggability, defined as the availability of small molecule

modulators or the presence of protein motifs which are potential drug-binding domains, and

available data on protein expression (Table 3). Genes in red font appear in the assembled druggable

genome database described in 2.4.7. Annotations including the gene cytoband and correlation

coefficient computed between copy number and expression in the panel of 29 PDAC cell lines were

included. In addition, a ‘DeltaExp’ column provides the log2 expression difference in the cell line

expressing the gene at the highest relative level and the cell line expressing the gene at the lowest

relative level. The ‘zGARP Score’ column lists the mean z-normalized gene activity rank profile

(GARP) of that gene in a pooled shRNA screen in 27 of the same PDAC cell lines used for copy

number and gene expression analysis. Annotations of ‘Druggability’ were added from the Druggable

Genome Database compiled as described in Methods section 2.4.7. The ‘DrugBank’ column provides

information on small molecules that target that specific gene and are characterized in DrugBank - a

comprehensive repository of drug information (Knox et al., 2011). ‘NormalPancExp’ and

‘PancTumorExp’ columns indicate protein levels of each gene as documented by

immunohistochemical analysis in the Human Protein Atlas (Uhlen et al., 2010), and ‘Differential

Exp’ indicates if differential expression between normal pancreatic and pancreatic tumor tissue

was observed at the mRNA level (GeneCards, 2012).

Table 3. Database of top-ranked candidate PDAC genes.

Gene Cytoband Correlation DeltaExp zGARP Score Druggable DrugBank Drugs Normal Prot Panc Tumor Prot Differential Exp (GeneCard)

RECQL 12p12 0.835 1.832 -0.609 N n/a Weak (low protein expression) Negative-Moderate Yes

TMEM85 15q14 0.832 2.333 0.654 N n/a n/a n/a n/a

VCP 9p13.3 0.825 1.638 -5.016 Y Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); Adenosine-5'-Diphosphate (DB03431); Moderate-Strong Moderate-Strong Moderate

GOLT1B 12p12.1 0.811 2.641 -0.779 N n/a n/a n/a Moderate

CLTA 9p13 0.805 1.491 -0.27 N n/a Moderate Weak-Strong Yes

MELK 9p13.2 0.803 1.431 -1.53 Y n/a Strong Moderate-Strong Yes

ESCO1 18q11.2 0.802 1.475 0.673 N n/a Strong Moderate-Strong n/a

SELS 15q26.3 0.779 1.616 n/a N n/a n/a n/a n/a

TMEM55B 14q11.2 0.768 2.474 n/a N n/a n/a n/a n/a

MED21 12p11.23 0.766 1.348 -0.323 N n/a Weak Weak-Moderate (Strong in CAPAN2) Yes

CMAS 12p12.1 0.758 2.864 0.435 Y Cytidine-5'-Monophosphate-5-N-Acetylneuraminic Acid (DB02485); Strong Weak-Strong Moderate

CCDC91 12p11.22 0.741 2.209 -0.362 N n/a Strong Weak-Strong Moderate

KIAA0528 12p12.1 0.739 2.121 0.278 N n/a n/a n/a No

WDR18 19p13.3 0.736 2.325 0.484 N n/a n/a n/a No

CHMP4B 20q11.22 0.732 2.659 n/a N n/a n/a n/a n/a

UBAP2 9p13.3 0.732 1.425 -0.203 N n/a Negative Weak-Strong No

RHBDF1 16p13.3 0.729 2.326 n/a Y n/a n/a n/a Moderate

AMN1 12p11.21 0.722 1.906 0.098 Y n/a n/a n/a n/a

CSNK2A1 20p13 0.718 1.369 0.963 Y Benzamidine (DB03127); S-METHYL-4,5,6,7-TETRABROMO-BENZIMIDAZOLE (DB04720); Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); Tetrabromo-2-Benzotriazole (DB04462); 2-(CYCLOHEXYLMETHYLAMINO)-4-(PHENYLAMINO)PYRAZOLO[1,5-A][1,3,5]TRIAZINE-8-CARBONITRILE (DB08354); 2,3,7,8-tetrahydroxychromeno[5,4,3-cde]chromene-5,10-dione (DB08468); 2-(4-ETHYLPIPERAZIN-1-YL)-4-(PHENYLAMINO)PYRAZOLO[1,5-A][1,3,5]TRIAZINE-8-CARBONITRILE (DB08360); N-(3-(8-CYANO-4-(PHENYLAMINO)PYRAZOLO[1,5-A][1,3,5]TRIAZIN-2-YLAMINO)PHENYL)ACETAMIDE (DB08362); 5,6-dichloro-1-beta-D-ribofuranosyl-1H-benzimidazole (DB08473); 3,8-DIBROMO-7-HYDROXY-4-METHYL-2H-CHROMEN-2-ONE (DB07802); 3-METHYL-1,6,8-TRIHYDROXYANTHRAQUINONE (DB07715); 1,2,5,8-tetrahydroxyanthracene-9,10-dione (DB08660); (5-Oxo-5,6-Dihydro-Indolo[1,2-a]Quinazolin-7-Yl)-Acetic Acid (DB01765); DIMETHYL-(4,5,6,7-TETRABROMO-1H-BENZOIMIDAZOL-2-YL)-AMINE (DB04719); 1,8-Di-Hydroxy-4-Nitro-Anthraquinone (DB03035); 1,8-Di-Hydroxy-4-Nitro-Xanthen-9-One (DB02170); 5,8-Di-Amino-1,4-Dihydr Strong Weak-Strong Yes

EIF2AK1 7p22 0.712 1.470 0.858 Y n/a Medium (protein expression of normal tissue) Weak-Strong Yes

PSMG2 18p11.21 0.709 2.145 0.126 N n/a n/a n/a Yes

WNK1 12p13.3 0.702 1.434 -1.201 Y n/a n/a n/a No

NOP10 15q14-q15 0.698 2.069 1.132 N n/a n/a n/a n/a

STOML2 9p13.1 0.697 1.578 0.832 Y n/a Moderate Negative-Moderate Yes

ECT2 3q26.1-q26.2 0.695 3.940 -3.195 Y n/a n/a n/a Yes

LYRM5 12p12.1 0.695 2.211 0.313 N n/a n/a n/a n/a

RALY 20q11.21-q11.23 0.688 1.751 -1.403 N n/a n/a n/a No

FKBP1A 20p13 0.683 2.472 -0.017 Y Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); (21S)-1AZA-4,4-DIMETHYL-6,19-DIOXA-2,3,7,20-TETRAOXOBICYCLO[19.4.0] PENTACOSANE FKB-001 (DB02888); L-709,587 (DB03621); {3-[3-(3,4-Dimethoxy-Phenyl)-1-(1-{1-[2-(3,4,5-Trimethoxy-Phenyl)-Butyryl]-Piperidin-2yl}-Vinyloxy)-Propyl]-Phenoxy}-Acetic Acid (DB01723); Rapamycin Immunosuppressant Drug (DB02439); Tacrolimus (DB00864); 4-Hydroxy-2-Butanone (DB04094); (3r)-4-(P-Toluenesulfonyl)-1,4-Thiazane-3-Carboxylicacid-L-Leucine (DB04012); Gpi-1046 (DB01951); (21S)-1AZA-4,4-DIMETHYL-6,19-DIOXA-2,3,7,20-TETRAOXOBICYCLO[19.4.0] PENTACOSANE (DB08520); Heptyl-Beta-D-Glucopyranoside (DB03338); N1,N2-ETHYLENE-2-METHYLAMINO-4,5,6,7-TETRABROMO-BENZIMIDAZOLE (DB04721); (3r)-4-(P-Toluenesulfonyl)-1,4-Thiazane-3-Carboxylicacid-L-Phenylalanine Ethyl Ester (DB01712); 6-[4-(2-piperidin-1-ylethoxy)phenyl]-3-pyridin-4-ylpyrazolo[1,5-a]pyrimidine (DB08597); Pimecrolimus (DB00337); Sirolimus (DB00877); MYRISTIC ACID (DB08231); Weak Negative Yes

FGFR1OP2 12p11.23 0.681 1.735 0.092 Y n/a High Moderate-Strong n/a

GSS 20q11.2 0.681 1.435 -0.673 Y Gamma-Glutamylcysteine (DB03408); Glycine (DB00145); L-Cysteine (DB00151); Adenosine-5'-Diphosphate (DB03431); Glutathione (DB00143); Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); n/a n/a No

CCDC77 12p13.33 0.680 2.266 0.074 N n/a Moderate Weak-Moderate n/a

ZSWIM1 20q13.12 0.677 1.544 0.034 N n/a n/a n/a n/a

RPS15 19p13.3 0.663 1.833 -2.381 Y n/a n/a n/a No

AFG3L2 18p11 0.658 3.056 -1.458 Y n/a Medium Moderate-Strong No

2.5.4 Identification of ECT2 for Laboratory Study through Integration of shRNA Pooled Screen Results

Genomic copy number and global expression data do not indicate whether or not a gene is

essential for cancer cellular pathways and networks, and as such, I sought to compare genomic

information with existing functional genetic screening data from Marcotte et al, whereby shRNAs

targeting ~16 000 genes were tested in a pooled screen on 27 of the same cell lines analyzed for

SCNGs and expression in this study (Marcotte et al, 2012). Thus, functional genetic screening data

for the set of CAN-GENES in PDAC was integrated with copy number and gene expression data to

further refine the list of putative driver genes. Among the 34 top-ranked genes in the CAN-GENES

list, 7 genes, namely VCP, ECT2, RPS15, MELK, AFG3L2, RALY and WNK1, were found to be

essential for PDAC cell viability, as measured by a median z-normalized GARP (zGARP) score lower

than -1 across all pancreatic cancer cell lines (Figure 13).
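The essentiality call described above reduces to a threshold on the median zGARP score across cell lines. The sketch below uses invented gene names and scores and is meant only to illustrate the rule, not to reproduce the screen analysis.

```python
from statistics import median

ZGARP_THRESHOLD = -1.0  # genes with a median zGARP below this are called essential


def essential_genes(zgarp_by_gene, threshold=ZGARP_THRESHOLD):
    """Genes whose median zGARP across cell lines falls below the threshold."""
    return sorted(
        gene
        for gene, scores in zgarp_by_gene.items()
        if scores and median(scores) < threshold
    )


# Hypothetical zGARP scores, one value per cell line.
zgarp_by_gene = {
    "GENE_A": [-3.0, -2.5, -0.5],
    "GENE_B": [0.2, -1.5, 0.4],
    "GENE_C": [-1.2, -1.1, -0.9],
}
essential = essential_genes(zgarp_by_gene)
```

Using the median means that a single strongly negative cell line, as for GENE_B here, is not enough to call a gene essential across the panel.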

Figure 13. shRNA pooled screen results for top-ranked candidate genes. Heatmap depicts z-normalized GARP (essentiality) scores of 34 CAN-GENES obtained from shRNA pooled screen analysis in 27 human pancreatic ductal adenocarcinoma cell lines. Left vertical axis indicates the PDAC cell line tested.

Next, I sought to determine if these 7 genes were essential to PDAC viability across all PDAC

cell lines or display relatively higher essentiality in cell lines that harbor copy number gains at their

respective loci. In other words, in the subset of PDAC cell lines in which the gene is essential, is it

also amplified, and vice versa? In order to formally assess this, it was necessary to assign a discrete

copy number for each of the candidate genes across the panel of PDAC cell lines. Copy number data

on the panel of 29 PDAC cell lines was processed through the Global Parameter Hidden Markov

Model (GPHMM) method, a standard algorithm for array-based copy number analyses (Li et al.,

2011). Genes with copy number ≥4 were grouped as ‘copy number gain’ in the respective cell line,

while genes with 2 copies were labeled diploid.

Data for the 7 essential genes are depicted in Figure 14. Notably, only ECT2 showed a

positive correlation between copy number, expression and zGARP across the 27 PDAC cell lines,

indicating that higher essentiality is observed in cell lines bearing ECT2 SCNGs (p=0.015; Figure

14a). This indicates that while the enriched gene set overall consists of genes that are essential to

PDAC cell viability (‘Pancreatic essentiality’), only ECT2 appears to display

‘Pancreatic gain-specific essentiality’, suggesting that increased essentiality may be a direct result

of copy number gain and upregulated expression of this gene.

Figure 14a. Comparison of essentiality scores of ECT2 in PDAC cell lines with copy number gains and cell lines in which ECT2 is diploid. The p-value denoted at the top of the plot indicates the significance of the difference in RNAi pooled screen z-normalized GARP scores (zGARP) between cell lines where ECT2 is in the diploid state and cell lines harboring genetic gains at the 3q26 locus.

Figure 14b. Comparison of essentiality scores of PDAC essential genes with copy number gains. Boxplots show genes that are overall essential across the 27 surveyed PDAC cell lines. p-values denoted at the top of the plots for each gene indicate the degree of significance between differences in RNAi pooled screen z-normalized GARP scores (zGARP) in cell lines where the gene is in the diploid state in comparison to cell lines harboring genetic gains at the respective locus.

These results, coupled with the fact that the 3q26 locus harboring ECT2 was found gained

or amplified in 24/60 (40%) of PDACs utilized in the initial analysis of copy number gains in human

PDACs using public datasets, indicated that gains at this locus may be recurrent genomic events

important in PDAC tumor progression. Moreover, 16/27 (59.3%) of the human PDAC cell lines

analyzed for essentiality in the pooled shRNA screen harbored SCNGs at the 3q26 locus, and thus a

large number of cell lines modeling gains at this locus were available for functional validation

experiments.

2.5.5 Targeted shRNA studies of ECT2 in Pancreatic Ductal Adenocarcinoma Cell Lines

As outlined in 2.5.4, an shRNA-based analysis of the same PDAC cell lines analyzed in this

study was conducted as part of a functional genomic study of cell lines utilizing a pooled shRNA

screen (Marcotte et al., 2012). Results of the screen demonstrate a trend linking ECT2 copy

number gain, expression and essentiality to cell viability, as measured by a z-normalized

GARP (zGARP) score of shRNA pooled screen performance (Figure 15). These results, together

with the formal analysis of shRNA pooled screen performance and copy number in 2.5.4,

prompted targeted validation of the role of ECT2 in PDAC cellular biology.


Figure 15. Histogram representation of copy number, expression and shRNA pooled screen performance for ECT2. Individual bars of the histogram represent data for all PDAC cell lines analyzed in a pooled shRNA screen (Marcotte et al., 2012). Below each cell line name is a symbol to indicate presence (+) or absence (-) of ECT2 copy number gain in each respective cell line. The height of each bar represents the expression level of ECT2 in the cell line. The color of each bar, as outlined in the legend, depicts the shRNA pooled screen zGARP value for ECT2 in the cell line.

Since ECT2 is implicated in cell proliferation, a critical experiment is to assess the

extent to which ECT2 genetic gain is associated with increased dependency on this protein for cell

viability. To assess this, five shRNAs that target ECT2, as well as appropriate controls from The

RNAi Consortium, were used in a targeted assay of the effect of ECT2 interference on cell viability

(Moffat et al., 2006). A panel of 10 of the same cell lines utilized in the genetic and transcriptomic

analysis in this study was selected for analysis. Of this panel of 10 cell lines, 5 harbor focal copy

number gains encompassing 3q26 (Capan-1, Capan-2, HPAF-II, Panc03.27, PATU8988S), 2 cell lines

bear arm-level gains of 3q (MIAPaCa-2, KP4), 1 cell line has a whole chromosome 3 gain

(Panc08.13), 1 cell line is copy neutral (diploid) for chromosome 3 (AsPc1) and 1 cell line has a one

copy loss in the 3q26 genomic region (Panc04.03). These estimates of copy number were

determined by processing the array-based copy number data through the GPHMM algorithm (Li et

al., 2011). Moreover, these results were entirely concordant with the copy number estimates


obtained through the mean probe intensity (MPI) measures I derived from the copy number data

(Table 4). The MPI measures are, in turn, nearly identical to the continuous measures obtained

when processing the data through the standard copy number analysis method of Circular Binary

Segmentation (CBS), indicating that the copy number estimates are consistent across methods

(Table 4, Figure 16). In addition, a recent study of 947 human cancer cell lines, termed the Cancer

Cell Line Encyclopedia, also profiled genomic copy number of the 10 cell lines utilized in this study,

and the results are highly concordant with this study (Table 5; [Barretina et al., 2012]).

Copy number plots for all cell lines used in this analysis are depicted in Figure 17a-e and

these plots are compared with copy number plots for the same cell lines derived as part of the

Wellcome Trust Sanger Institute’s Cancer Cell Line Project

(http://www.sanger.ac.uk/genetics/CGP/CellLines/).

Table 4. Results of copy number measures for ECT2 in cell lines utilized for targeted shRNA analyses obtained through different computational methods.

Cell Line     ECT2 MPI*   ECT2 CBS**   ECT2 GPHMM Copy Number†
Capan-2       0.53727     0.4164       5
Capan-1       0.40464     0.3438       4
HPAF-II       0.37382     0.3026       5
PaTu8988S     0.35131     0.2987       5
Panc03.27     0.33849     0.2302       4
MIA-PaCa-2    0.2137      0.1582       4
Panc08.13     0.17869     0.1596       3
KP-4          0.06285     0.0743       3
AsPc1         -0.10007    -0.0685      2
Panc04.03     -0.12725    -0.1213      1

*Results of analysis in this study. **Olshen et al., 2004.

†Li et al., 2011.


Figure 16. Comparison of Mean Probe Intensity (MPI) copy number estimation approach with Circular Binary Segmentation (CBS) for copy number estimation of ECT2. MPI measures utilized in this study are compared with measures for copy number obtained by processing the same data through the CBS algorithm described by Olshen et al., 2004.
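The agreement between the MPI and CBS measures reported in Table 4 can be checked directly from the tabulated values. The sketch below transcribes the MPI and CBS columns of Table 4 and computes a Pearson correlation coefficient; the choice of Pearson correlation as the summary statistic is an assumption made for illustration and is not stated as the method used for Figure 16.

```python
from math import sqrt

# ECT2 copy number measures transcribed from Table 4: (MPI, CBS) per cell line.
table4 = {
    "Capan-2": (0.53727, 0.4164),
    "Capan-1": (0.40464, 0.3438),
    "HPAF-II": (0.37382, 0.3026),
    "PaTu8988S": (0.35131, 0.2987),
    "Panc03.27": (0.33849, 0.2302),
    "MIA-PaCa-2": (0.2137, 0.1582),
    "Panc08.13": (0.17869, 0.1596),
    "KP-4": (0.06285, 0.0743),
    "AsPc1": (-0.10007, -0.0685),
    "Panc04.03": (-0.12725, -0.1213),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


mpi = [pair[0] for pair in table4.values()]
cbs = [pair[1] for pair in table4.values()]
r = pearson_r(mpi, cbs)
print(f"Pearson r (MPI vs CBS) = {r:.3f}")
```

A high coefficient across these ten cell lines supports agreement between the two methods, although it does not by itself establish the accuracy of either estimate.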

Table 5. Copy number analysis of pancreatic cancer cell lines in Barretina J, et al. 2012. Cell lines are ordered by ‘seg.mean’ measure of copy number (Barretina J, et al. 2012).

CCLE_name chrom loc.start loc.end num.mark seg.mean

CAPAN2_PANCREAS 3 173044302 176065417 2001 1.0032

CAPAN1_PANCREAS 3 164028410 190843448 16634 0.8092

HPAFII_PANCREAS 3 168371964 176150140 4901 0.7195

PANC0327_PANCREAS 3 169898702 175222564 3439 0.6604

PATU8988S_PANCREAS 3 169655701 176185849 4214 0.613

PANC0813_PANCREAS 3 169779880 179474606 6273 0.3523

MIAPACA2_PANCREAS 3 168320734 190145372 13735 0.3026

KP4_PANCREAS 3 172208475 174469772 1516 0.1973

ASPC1_PANCREAS 3 75760125 194361700 68972 -0.2326

PANC0403_PANCREAS 3 163625382 174712073 6661 -0.2937
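The concordance between this study and the Cancer Cell Line Encyclopedia can be illustrated by comparing how the two datasets rank the ten cell lines. The sketch below uses the MPI values from Table 4 and the seg.mean values from Table 5 and computes a Spearman rank correlation. Two points are assumptions rather than statements from the source: that Spearman correlation is an appropriate summary here, and that seg.mean is a log2 ratio relative to two copies, so that 2 * 2^seg.mean gives a rough copy number. That conversion ignores tumor ploidy and other corrections, so it is not expected to reproduce the GPHMM calls in Table 4.

```python
# MPI values from Table 4 and CCLE seg.mean values from Table 5, keyed by cell line.
mpi = {
    "Capan-2": 0.53727, "Capan-1": 0.40464, "HPAF-II": 0.37382,
    "PaTu8988S": 0.35131, "Panc03.27": 0.33849, "MIA-PaCa-2": 0.2137,
    "Panc08.13": 0.17869, "KP-4": 0.06285, "AsPc1": -0.10007,
    "Panc04.03": -0.12725,
}
ccle_seg_mean = {
    "Capan-2": 1.0032, "Capan-1": 0.8092, "HPAF-II": 0.7195,
    "Panc03.27": 0.6604, "PaTu8988S": 0.613, "Panc08.13": 0.3523,
    "MIA-PaCa-2": 0.3026, "KP-4": 0.1973, "AsPc1": -0.2326,
    "Panc04.03": -0.2937,
}


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


cell_lines = sorted(mpi)
rho = spearman_rho([mpi[c] for c in cell_lines],
                   [ccle_seg_mean[c] for c in cell_lines])

# Assumption: seg.mean = log2(copies / 2), hence copies ~= 2 * 2**seg.mean.
approx_copies = {c: 2 * 2 ** ccle_seg_mean[c] for c in cell_lines}

print(f"Spearman rho (MPI vs CCLE seg.mean) = {rho:.3f}")
for name in cell_lines:
    print(f"{name}: approx. {approx_copies[name]:.2f} copies")
```

The two rankings differ only in the order of two adjacent pairs of cell lines (PaTu8988S and Panc03.27; MIA-PaCa-2 and Panc08.13).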


Figure 17a. Copy number plots for ECT2 in cell lines with focal 3q26 gains. Each plot depicts copy number data for the respective cell lines (Capan-1, Capan-2, HPAF-II, Panc03.27, PATU8988S). The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. Dashed red lines on each plot represent the chromosomal position of ECT2 and dashed blue lines represent a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel (where available) is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.


Figure 17b. Copy number plots for ECT2 in cell lines with arm-level 3q gains. Each plot depicts copy number data for the respective cell lines (KP-4, MIAPaCa-2). The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. Dashed red lines on each plot represent the chromosomal position of ECT2 and dashed blue lines represent a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.

Figure 17c. Copy number plots for ECT2 in Panc08.13 harboring whole chromosome 3 gain. The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. The dashed red line represents the chromosomal position of ECT2 and the dashed blue line represents a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.


Figure 17d. Copy number plots for ECT2 in AsPc1 with a diploid/copy neutral chromosome 3. The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. The dashed red line represents the chromosomal position of ECT2 and the dashed blue line represents a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.

Figure 17e. Copy number plots for ECT2 in Panc04.03 with a one copy loss of ECT2. The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. Dashed red line represents the chromosomal position of ECT2 and dashed blue line represents a LogR ratio of 0, or an approximate copy neutral state.


Using five shRNAs to target ECT2, along with three negative control shRNAs (LACZ, GFP and

LUC) and two positive control shRNAs (PSMD1, SNRPD1), the effects of shRNA-mediated ECT2

interference on cell viability (nuclei counts) were assayed. Cells were treated with lentivirus

containing the shRNA constructs in duplicate assays: one assay plate was treated with puromycin to

select for cells that effectively take up lentivirus, since the lentiviral plasmid DNA contains a

puromycin resistance gene and puromycin is otherwise toxic to mammalian cells in culture; the

other assay plate was not treated with puromycin. Effective puromycin selection can be inferred

from the cell viability of the cells not treated with lentivirus, or the ‘No Virus’ control. Cells not

treated with lentivirus containing a puromycin resistance gene should be eradicated by puromycin

treatment, and cells that are not treated with either lentivirus or puromycin should demonstrate

substantially higher cell viability. This represents an internal experimental control for puromycin

selection of cells that effectively take up the lentivirus containing the respective shRNA construct,

such that cell viability measures are not confounded by existing cells that did not effectively take up

the shRNA construct.

Results from each cell line are depicted in Figure 18a-j. Nuclei counts are normalized to the

GFP negative control shRNA as this shRNA construct is known to have the least effects on cell

growth, relative to any other control shRNA. In cell lines bearing ECT2 copy number gains (Figure

18a-e; Capan-2, Capan-1, HPAF-II, Panc03.27 and PATU8988S), there appears to be only minor

commonality between susceptibility to shRNA-mediated interference and presence of ECT2 copy

number gain. In the Capan-2 cell line with a focal copy number gain encompassing ECT2, all ECT2

shRNAs appear to diminish cell viability to some extent; however, only the ECT2-1, ECT2-2 and ECT2-5

shRNAs yield statistically significant decreases in cell viability relative to GFP (p=2.55e-5, 0.01,

0.0015, respectively). In contrast, in cell lines Capan-1 and HPAF-II, which also bear focal genomic

gains at the ECT2 locus, only the ECT2-1 shRNA efficiently decreases cell viability (Capan-1:

p=0.0027; HPAF-II: p=0.0011). In the Panc03.27 cell line bearing a gain from 3q26-qter

encompassing ECT2, all ECT2 shRNAs effectively decrease cell viability and this is statistically

significant for all shRNAs (p<0.05). Finally, for PATU8988S, only ECT2-1 and ECT2-2 shRNAs

significantly decrease cell viability (p<0.001), but not the other ECT2 shRNAs.
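The normalization of nuclei counts to the GFP control, followed by a comparison of each ECT2 shRNA against shGFP, can be sketched as follows. This is an illustration under stated assumptions: the raw counts and the number of replicate wells are hypothetical, and Welch's t statistic is swapped in as an example comparison because the specific test used to derive the reported p-values is not named in this section.

```python
from math import sqrt
from statistics import mean, variance


def normalize_to_control(counts, control_counts):
    """Express each nuclei count as a fraction of the mean control count."""
    control_mean = mean(control_counts)
    return [count / control_mean for count in counts]


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Hypothetical raw nuclei counts per replicate well; NOT thesis data.
raw_counts = {
    "shGFP": [1000, 1040, 960],
    "shECT2-1": [310, 290, 300],
    "shECT2-3": [900, 1010, 940],
}

normalized = {
    name: normalize_to_control(counts, raw_counts["shGFP"])
    for name, counts in raw_counts.items()
}

for name in ("shECT2-1", "shECT2-3"):
    t_stat, dof = welch_t(normalized[name], normalized["shGFP"])
    print(f"{name}: mean = {mean(normalized[name]):.2f}, "
          f"t = {t_stat:.2f}, df = {dof:.1f}")
```

By construction the shGFP wells average to 1.0 after normalization, so values below 1.0 indicate fewer nuclei than in the control.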


Figure 18a-e. Targeted shRNA-mediated interference of ECT2 in PDAC cell lines with focal 3q26 copy number gains. Panels a-e depict results from targeted shRNA analysis in 5 PDAC cell lines bearing focal genomic gains at the 3q26 locus harboring ECT2. Black bars represent normalized nuclei counts in cells treated with the respective shRNA construct and puromycin; bars with diagonal fill represent normalized nuclei counts in cells treated with the respective shRNA construct but no puromycin. ECT2-1 to ECT2-5 are five individual constructs targeting ECT2. LACZ, GFP and LUC constructs are negative control shRNAs while PSMD1 and SNRPD1 constructs are positive control shRNAs. Complementary images of stained nuclei are shown in each panel for puromycin-treated cells with ECT2-1 to ECT2-5 shRNAs as well as GFP shRNA (*p<0.05 relative to shGFP; **p<0.001 relative to shGFP).

The cell lines bearing arm-level 3q copy number gains appear to be generally susceptible to

ECT2 inhibition by shRNA (Figure 18f-g; MIA PaCa2, KP4). In the MIA PaCa2 cell line, only two

shRNAs, ECT2-1 (p<0.001) and ECT2-5 (p=0.0017), reduced cell viability. Similarly, in the KP4 cell

line, the ECT2-1 and ECT2-5 shRNAs diminish cell viability (p<0.001), and ECT2-2 and ECT2-3

also decrease cell viability (p<0.001). However, it is important to note that in the KP4 cell line,

growth was not inhibited by the positive control shRNA constructs, calling the results from this cell

line (performed in two technical replicates) into question.



Figure 18f-g. Targeted shRNA-mediated interference of ECT2 in PDAC cell lines with arm-level 3q gains. Panels f and g depict results from targeted shRNA analysis in 2 PDAC cell lines with arm-level 3q copy number gains. Black bars represent normalized nuclei counts in cells treated with the respective shRNA construct and puromycin; bars with diagonal fill represent normalized nuclei counts in cells treated with the respective shRNA construct but no puromycin. ECT2-1 to ECT2-5 are five individual constructs targeting ECT2. LACZ, GFP and LUC constructs are negative control shRNAs while PSMD1 and SNRPD1 constructs are positive control shRNAs. Complementary images of stained nuclei are shown in each panel for puromycin-treated cells with ECT2-1 to ECT2-5 shRNAs as well as GFP shRNA (*p<0.05 relative to shGFP; **p<0.001 relative to shGFP).




Effects of shRNA-mediated ECT2 interference are indeed generally less pronounced in the

cell lines that do not harbor focal ECT2 copy number gains (Figure 18h-j). The Panc08.13 cell line

bears a whole chromosome 3 gain and a near triploid genome and AsPc1 has a diploid chromosome

3. None of the ECT2 shRNAs significantly attenuated cellular growth and viability in either of these cell

lines; the ECT2-1 shRNA had the greatest effect on cell viability in both, but this was not

statistically significant. Finally, in the Panc04.03 cell line, which bears a one copy loss of ECT2, only

the ECT2-1 shRNA significantly reduced cell viability (p<0.001).



Figure 18h-j. Targeted shRNA-mediated interference of ECT2 in PDAC cell lines not bearing ECT2 gains. Panels h-j depict results from targeted shRNA analysis in 3 PDAC cell lines. Black bars represent normalized nuclei counts in cells treated with the respective shRNA construct and puromycin; bars with diagonal fill represent normalized nuclei counts in cells treated with the respective shRNA construct but no puromycin. ECT2-1 to ECT2-5 are five individual constructs targeting ECT2. LACZ, GFP and LUC constructs are negative control shRNAs while PSMD1 and SNRPD1 constructs are positive control shRNAs. Complementary images of stained nuclei are shown in each panel for puromycin-treated cells with ECT2-1 to ECT2-5 shRNAs as well as GFP shRNA (*p<0.05 relative to shGFP; **p<0.001 relative to shGFP).




The next analytical step was to formally compare the results from these targeted

experiments with the results for ECT2 in the pooled shRNA screen on the same cell lines

(Marcotte et al., 2012). ‘Essentiality’ of a gene to cell viability in a pooled shRNA screen is

determined by a Gene Activity Rank Profile (GARP) score for each individual gene. The GARP score

is derived from the shRNA Activity Rank Profile (shARP) scores for the two shRNAs that most

effectively diminish transcript levels over a defined time-course. These top two shRNAs are

therefore the ‘fastest drop-outs’ and their shARP scores are averaged to provide the GARP score for

the gene. The two fastest drop-out shRNAs targeting a given gene are not necessarily the same

two hairpins in every cell line.
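The GARP calculation described above (averaging the shARP scores of the two fastest drop-out shRNAs) can be sketched in a few lines. The shARP values below are hypothetical, and two details are assumptions made for illustration: that a more negative shARP score corresponds to a faster drop-out, and that the z-normalization is taken against a reference set of GARP scores from the same screen.

```python
from statistics import mean, pstdev


def garp_score(sharp_scores):
    """Average the two lowest (fastest drop-out) shARP scores for one gene."""
    fastest_two = sorted(sharp_scores.values())[:2]
    return mean(fastest_two)


def z_normalize(score, reference_scores):
    """z-score of one GARP value against a reference set of GARP values."""
    return (score - mean(reference_scores)) / pstdev(reference_scores)


# Hypothetical shARP scores for five hairpins in one cell line; NOT thesis data.
sharp_ect2 = {
    "ECT2-1": -6.0,
    "ECT2-2": -0.5,
    "ECT2-3": -0.2,
    "ECT2-4": 0.1,
    "ECT2-5": -0.4,
}

garp = garp_score(sharp_ect2)
garp_without_top = garp_score(
    {name: score for name, score in sharp_ect2.items() if name != "ECT2-1"}
)

print(f"GARP with ECT2-1 included:  {garp:.2f}")
print(f"GARP with ECT2-1 excluded:  {garp_without_top:.2f}")
```

In this toy example a single strongly depleted hairpin moves the gene-level score from -0.45 to -3.25, which shows how an average of only two hairpins can be dominated by one of them.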

In order to directly compare the results from this study to the results from the shRNA

pooled screen, the top two shRNA constructs (fastest drop-outs) identified in the screen are utilized

for the comparison as these are the same shRNAs that were used in the pooled screen to compute

the Gene Activity Rank Profile (GARP) scores and therefore measures of essentiality depend on the

effects of these two shRNAs on cell viability. As mentioned, the identity of the two shRNAs is not the

same in all cell lines; however, the fastest dropout in all cell lines in the shRNA pooled screen,

ECT2-1, also demonstrated the strongest effect on cell viability in this study (Figure 19). Importantly,

effects on nuclei counts observed with shRNA-1 (ECT2-1) are not always consistent with

the effects observed with the second-fastest drop-out, shRNA-2 (Figure 19). In addition,

ECT2-1 has strong effects on cell viability in most cell lines, so the average

effect of shRNA-1 and shRNA-2 (the GARP score) may be heavily weighted towards effects of

shRNA-1. Data on the ECT2-1 shRNA construct from The RNAi Consortium indicate that this

construct knocks down ECT2 least efficiently of the five ECT2 shRNAs, as

measured by percent mRNA remaining after knock-down (Appendix Table A4), and has numerous

off-target effects. This suggests that the strong effects of shRNA-1 on cell viability may

not result solely from ECT2 inhibition.


Figure 19. Targeted shRNA experiment results. Height of the blue bars represents effects on cell viability with shRNA-1 (ECT2-1), the fastest dropout in all cell lines. Height of the pink bars represents effects on cell viability with the second-fastest dropout shRNA, which is not the same among all cell lines. Annotations of ECT2 copy gain are below the cell line names.

The culmination of these results leads to the prominent observation that the ECT2-1 shRNA

appears to have the most pronounced and statistically significant effect on cell viability in all cell

lines tested, in comparison to the four other ECT2 shRNAs tested in this study, as well as the next

fastest drop-out shRNA in the pooled screen. The implications of this are that the GARP scores for

ECT2 in the pooled screen may be driven by the strong effect of the ECT2-1 shRNA since these

scores are the averages of ECT2-1 and the next fastest drop-out shRNA.

Overall, the findings from this study are somewhat in agreement with results from the

pooled screen by Marcotte et al. (Figure 20). That is, in cell lines in which ECT2 was deemed

essential in the pooled screen (z-normalized GARP score <-3), this study showed consistent results,

indicated by decreased cell viability with shRNA-mediated ECT2 interference.


Figure 20. Comparison of targeted shRNA-mediated ECT2 interference with shRNA pooled screen results. The bottom half of the graph depicts z-normalized GARP scores from the shRNA pooled screen and data is depicted on cell lines having highest ECT2 essentiality to lowest ECT2 essentiality (left to right), as indicated by negative zGARP scores. The top half of the graph depicts the results from this study. The pink bars represent ECT2-1 shRNA-1 in all cell lines, while the green bar represents the next fastest shRNA drop-out construct and is not the same construct across all lines (shRNA-2). Actual values are outlined in the table below the graph.

One of the cell lines tested in the targeted analysis was not included in the pooled shRNA

screen (Capan-1), and of the 9 remaining cell lines tested, results from 6 cell lines in this study were

generally concordant with the pooled screen results. This indicates that in these 6 cell lines, results

from the top two shRNAs can reliably predict essentiality observed in the pooled screen (Table 6;

data for which the results of this study were discordant with the pooled screen analysis are

highlighted in red). In the three cell lines for which the results from this study are discordant with

the results from the pooled screen (HPAF-II, Panc08.13, Panc04.03), it is evident that the pooled

screen classification of ECT2 as essential in these lines is heavily weighted by the effects of shRNA-1

(ECT2-1). Interestingly, for Panc04.03, the cell line characterized by one copy loss of ECT2, the

pooled screen classified ECT2 as essential; however, while shRNA-1 (ECT2-1) resulted in a

statistically significant decrease in cell viability, shRNA-2 resulted in a statistically significant

increase in cell viability. Thus, conceivably, the average of these two extreme measures yielded

a GARP score that classified ECT2 as essential in this cell line, when in fact

results from one of the two shRNAs suggest that ECT2 knock-down (if completely efficient with this

shRNA) may confer a growth advantage in this cell line rather than reduce cell

viability.

Table 6. Comparison of targeted shRNA analysis nuclei counts with results from pooled shRNA screen (shRNA-1 and shRNA-2 counts are results from this study).

Taken together, it is unclear whether ECT2 copy number gains can reliably predict effects of

shRNA-mediated ECT2 interference on cell viability. There is indeed a potential trend towards

increased essentiality in relation to copy number gains. However, more cell lines must be tested

with proven efficient shRNAs to more reliably assess the extent of the association between ECT2

copy number gains and increased dependence on ECT2 for cell viability.


2.5.6 Functional Effects of Pharmacological Inhibition of the ECT2 Pathway on Cell Viability

One of the primary goals of this study was to identify a gene that is gained in PDAC, is

coordinately highly expressed and is amenable to drug targeting. While no small molecules directly

targeting ECT2 currently exist, there are well-characterized small molecule inhibitors of the protein

kinase Polo-like kinase 1 (PLK1) which phosphorylates ECT2 and has been reported necessary for

its downstream cellular activity (Niiya et al., 2006; Petronczki et al., 2007; Wolfe et al., 2009). As

such, it was rational to deploy inhibitors of PLK1 to perturb the ECT2-mediated pathway (Figure

21).

Figure 21. Pharmacological modulation of ECT2-mediated oncogenesis. Through chemical inhibition of a necessary upstream regulator, PLK1, the ECT2-mediated pathway can be perturbed.

A panel of 6 cell lines was treated with two PLK1 inhibitors individually (BI-6727 and

GSK461364) in 3-fold serial dilutions. Cell viability was measured after drug treatment for 72 hours

and normalized to cell viability in DMSO-treated control cells (Figure 22). The cell line Panc04.03

with a one-copy loss of ECT2 appears to be resistant to treatment with both compounds up to a

maximal concentration (3µM). However, the distinction between drug sensitivity in cell lines with

an ECT2 gain (PATU8988S, KP4, HPAF-II, Capan-1), in comparison to the cell line AsPc1 in which

ECT2 is in the diploid state, is unclear.
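The dose series and normalization used in these drug treatments can be illustrated as follows. The 3 µM top concentration, the 3-fold dilution step and the normalization to DMSO-treated cells come from the text; the number of dilution points, the raw readouts and the log-linear IC50 interpolation are assumptions added purely for illustration.

```python
from math import log10


def serial_dilution(top_concentration, fold, n_points):
    """Concentrations from the top dose downward in constant fold steps."""
    return [top_concentration / fold ** i for i in range(n_points)]


def normalize_to_vehicle(signals, vehicle_signal):
    """Express each viability readout as a fraction of the DMSO control."""
    return [signal / vehicle_signal for signal in signals]


def interpolate_ic50(doses, viability):
    """Log-linear interpolation of the dose at which viability crosses 0.5."""
    pairs = sorted(zip(doses, viability))  # ascending dose
    for (dose_lo, v_lo), (dose_hi, v_hi) in zip(pairs, pairs[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            fraction = (v_lo - 0.5) / (v_lo - v_hi)
            log_dose = log10(dose_lo) + fraction * (log10(dose_hi) - log10(dose_lo))
            return 10 ** log_dose
    return None  # 50% viability not reached within the tested range


# 3 µM top dose and 3-fold steps are from the text; 8 points is an assumption.
doses_um = serial_dilution(3.0, 3, 8)

# Hypothetical raw readouts (highest dose first) and DMSO control; NOT thesis data.
raw_sensitive = [120, 150, 260, 480, 700, 880, 960, 990]
raw_resistant = [900, 920, 950, 960, 980, 990, 1000, 1000]
dmso_signal = 1000

viability_sensitive = normalize_to_vehicle(raw_sensitive, dmso_signal)
viability_resistant = normalize_to_vehicle(raw_resistant, dmso_signal)

ic50_sensitive = interpolate_ic50(doses_um, viability_sensitive)
ic50_resistant = interpolate_ic50(doses_um, viability_resistant)

print("doses (µM):", [round(dose, 4) for dose in doses_um])
print("sensitive line IC50 (µM):", ic50_sensitive)
print("resistant line IC50 (µM):", ic50_resistant)
```

A cell line that never falls below 50% viability within the tested range, as described for Panc04.03, returns no IC50 in this sketch rather than an extrapolated value.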


Figure 22. Treatment of PDAC cell lines with PLK1 inhibitors. Cell viability is normalized to that measured in cells treated with only DMSO solvent. The top panel (a) depicts treatment of PDAC cell lines with BI-6727. Panel (b) depicts treatment of PDAC cell lines with GSK461364.

The one clear inference that can be made from these pharmacological studies is that a

cell line with a one copy loss of ECT2, Panc04.03, appears to be highly resistant to PLK1 inhibition

over the range of concentrations at which other cell lines are susceptible to these compounds. Whether

or not this effect is associated with one copy loss of ECT2 requires further investigation. For

example, there may be other critical mutations present in this cell line which may contribute to its

resistance to PLK1 inhibitor compounds. These questions must be formally addressed before

concluding that ECT2 genomic copy number losses confer resistance to PLK1 inhibitors.


There are no obvious differences in susceptibility to PLK1 inhibitors in the ECT2 diploid cell

line, in comparison to the cell lines bearing ECT2 gains. In order to reveal such differential

susceptibility, if it exists, it is necessary to identify more cell lines in which ECT2 is diploid and

assess the effects of PLK1 inhibition on cell viability in these cell lines. In the present study, it is

difficult to draw conclusions from only one cell line that is diploid at the ECT2 locus. Interestingly,

the overall pattern of cell line drug sensitivity observed for BI-6727 is nearly identical to that observed

for GSK461364, indicating that susceptibility to PLK1 inhibition across the cell

lines is similar for both compounds (Figure 22). For example, Panc04.03 is the most resistant

cell line to PLK1 inhibition with both compounds, and KP4 is the most susceptible cell line to PLK1

inhibition with both compounds.

These results do, however, indicate that the cell viability of some PDAC cell lines is substantially

diminished by PLK1 inhibitors and that these compounds may be effective in PDAC treatment, although

the candidate tumor aberrations that would likely confer most benefit from PLK1 therapeutics

remain to be elucidated.


Chapter 3

3 Discussion

3.1 Pooling Data from Genome-Wide Analyses

This study was aimed at utilizing a rational approach to therapeutic target identification

through integrated analysis of somatic copy number gains (SCNGs), gene expression, and RNAi

analysis. Such an approach towards identifying SCNGs and over-expressed genes in cancer has been

successful, as exemplified by the earliest targeted therapies. Identification of gained and over-

expressed genes across multiple tumor samples in PDAC can therefore point to potential

therapeutic targets for further study.

While the initial target identification phase of this study was genome-wide in its design, it is

limited by factors inherent to each of the original datasets employed to identify the regions of copy

number gain in primary tumors. As outlined in Appendix Table 1, each study utilized a different

platform and associated algorithm for calling somatic copy number alterations (SCNAs). A review of

copy number alteration detection platforms underscores the inherent variability in detecting

SCNAs using different techniques or the same platform but different computational algorithm

(Pinto D et al., 2011). For this reason, pooling results from independent studies for the purpose of

creating a unified dataset is not an optimal approach to further assessing SCNAs. In addition, there

are a variety of limitations associated with each of the four datasets. The QCMG (n=3) and OICR

(n=5) datasets are small. For the QCMG dataset, the small sample size may restrict SCNA

detection to only those alterations that are either relatively rare or highly recurrent. Furthermore, the

sample ICGC-ABMP-20090811-04-CD from the QCMG dataset has an abnormally high proportion of

gains in its genome (12.85%, while the mean across previously published datasets is ~ 1.5%). This

study employed sequencing methods that can detect SCNAs with much higher resolution in

comparison to array-based platforms, however, the algorithms applied to sequencing data are not

as well-established as those for array-based methods.

Furthermore, on close inspection, two of the five samples in the OICR dataset also appear to

be associated with peculiarly large gains, as >20% of the genome is gained in each of

these two samples. In general, the large size of gains in this dataset may be due to extreme noise

spanning a large genomic region, or to the 2-sided Kolmogorov-Smirnov test used to delineate

boundaries being too aggressive in merging gained regions. Another potential error may be that the

baseline was called too low, so that what is being called a gain or amplification in fact corresponds to

normal or baseline intensity. Short of excluding this dataset from the analysis, the extent

to which these data are essential for the final analysis was assessed by selecting

genomic gains identified in three out of the four datasets, as previously described.

Overall, with these limitations in mind, it appears that the method of selecting regions of

genomic gain in at least three of the four studies increases the likelihood that the gains selected for

further analyses are true recurrent genomic events, and these regions have been identified as

targets of SCNGs in PDAC as well as other cancers.
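The selection rule discussed above (retaining only regions gained in at least three of the four datasets) can be sketched as a simple recurrence filter. This is a simplified illustration: regions are represented by cytoband labels rather than genomic intervals, the calls listed for each dataset are hypothetical, and `dataset_C` and `dataset_D` are placeholder names for the two datasets not named in this section.

```python
from collections import Counter


def recurrent_gains(gains_by_dataset, min_datasets=3):
    """Return loci called as gained in at least `min_datasets` datasets."""
    counts = Counter(
        locus for loci in gains_by_dataset.values() for locus in set(loci)
    )
    return sorted(locus for locus, n in counts.items() if n >= min_datasets)


# Hypothetical gain calls per dataset; NOT the calls used in this study.
gains_by_dataset = {
    "QCMG": {"3q26", "8q24", "1q21"},
    "OICR": {"3q26", "8q24", "20q13", "1q21"},
    "dataset_C": {"3q26", "20q13"},
    "dataset_D": {"3q26", "8q24", "7p11"},
}

selected = recurrent_gains(gains_by_dataset, min_datasets=3)
print("gains supported by at least 3 of 4 datasets:", selected)
```

Requiring support from three datasets means that a region called only in the two small or noisy datasets cannot enter the final list on their evidence alone. A full implementation would need to intersect genomic coordinates rather than compare labels.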

3.2 Analysis of Top-Ranked Candidate Genes and Identification of ECT2 as a Putative Target

Since there was no accompanying gene expression data from the same tumor samples

utilized in the initial phase of this analysis, expression measures were assessed and corroborated

with copy number measures in an independent panel of well-characterized human PDAC cell lines.

This was a useful approach because the same cell lines could subsequently be used as tools for

laboratory-based study of the candidate genes. In addition, 27 of the same cell lines were

characterized in a functional genomics pooled shRNA study.

The approach of correlating copy number with gene expression of genes identified as

gained or amplified in human tumors was suitable for identifying target genes that can reasonably

be studied in functional validation assays and may be potential drivers. Among the 34 top-ranked

genes, ECT2 showed the highest expression in PDAC lines and an excellent correlation between

copy number and expression. Moreover, unlike many of the other top-ranked genes, there were

numerous cell lines (n=16) which bear genomic gains at the 3q26 locus harboring ECT2, allowing

for use of multiple biological replicates to model the role of this gene and most importantly, the

extent to which genetic gains of this gene are important to the tumors that harbor them.

In the pooled shRNA screen, 7 of the 34 top-ranked genes, VCP, ECT2, RPS15, MELK, RALY,

AFG3L2, and WNK1, were determined to be essential to PDAC cell viability. Interestingly, the only

gene with a statistically significant positive correlation between copy number, expression, and

essentiality in an RNAi screen was ECT2. This gene has been identified as an oncogene, and its

genomic amplification and elevated expression have been observed in many other human cancers.

There are no known small-molecule chemical inhibitors of the ECT2 protein, however its role in

oncogenic processes and the association between ECT2 copy number gain at the 3q26 locus,


upregulation and essentiality, indicated that this gene may indeed be a suitable target for further

study in PDAC.

It is also worth mentioning that copy number gains at the 9p13 locus which harbors

valosin-containing protein (VCP), do not appear to impart an increase in essentiality and therefore

susceptibility to RNAi, but rather, it appears that copy number gains at this locus are associated

with decreased essentiality scores. Interestingly, VCP is the most highly essential gene among the

top-ranked candidates (mean zGARP = -5.02). It may be the case that at a basal level, VCP is a highly

essential gene, and thus effects of increased cellular gene dosage of this gene do not translate to an

enhanced proliferative capacity. Another consideration is the essentiality of VCP to normal cellular

function, in that VCP function is required for viability of normal cells. Therefore, perhaps because

this gene is uniformly essential and targeted knock-down leads to cell death, it would be difficult to

observe any positive correlation between cell viability and genomic and transcriptomic measures.

Notwithstanding this, there may potentially be clinical utility to targeting VCP in cancer and,

as in the case of proteasome inhibitors, a therapeutic window for VCP inhibitors which may hold

promise for selective therapeutic targeting of VCP in cancer. The high degree of essentiality of VCP

in pancreatic cancer cells suggests that targeting VCP may have therapeutic value and the protein

product of VCP has been shown to bind a series of novel small chemical compounds, which are

potential anticancer therapeutics (Bursavich et al., 2010).

Although only ECT2 from the 34 top-ranked genes was found to have a positive correlation

between copy number, expression, and essentiality, there may be other cellular factors that

contribute to the extent of essentiality. For example, copy number, expression and essentiality may

only be proportional until a certain threshold, whereby the association between them is no longer

direct. A gene may be essential as a result of a low copy gain, and further increases in copy number

do not translate into increased essentiality. In this instance, the copy gain may have significance to

the role of this gene in tumorigenesis, but this would be missed by our correlation-centered

approach. Similarly, RPS15, MELK, AFG3L2, RALY and WNK1 are also highly essential in PDAC and

further study of these genes may provide insight into their role in the PDAC neoplastic process.

Biological investigation of these genes may be warranted in order to obtain a clearer understanding

of the mechanisms that underlie their tumorigenicity, and these genes may be the next best

candidates for functional validation.
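The threshold scenario described above can be illustrated with a small numerical sketch. The values below are hypothetical and are not data from this study; they assume a gene that becomes equally essential in every cell line carrying any copy gain, with more negative scores denoting greater essentiality. The sketch compares a correlation across all lines with a simple difference in group means.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical cell lines: copy number and an essentiality score
# (assumed convention: more negative = more essential).
copy_number = [2, 2, 3, 3, 4, 6, 8, 10]
essentiality = [-1.0, -1.0, -4.0, -4.0, -4.0, -4.0, -4.0, -4.0]

# Correlation across all lines treats every extra copy as informative.
r_all = pearson_r(copy_number, essentiality)

# Group comparison only asks whether any gain is present.
gained = [e for c, e in zip(copy_number, essentiality) if c > 2]
diploid = [e for c, e in zip(copy_number, essentiality) if c == 2]
mean_shift = sum(gained) / len(gained) - sum(diploid) / len(diploid)

print(f"Pearson r across all lines: {r_all:.2f}")
print(f"Mean essentiality shift (gained - diploid): {mean_shift:.2f}")
```

In this toy setting every gained line is equally essential, yet the correlation across all lines falls well short of a perfect association because copy number beyond the first gain adds no further signal. A gained-versus-diploid comparison may therefore be a useful complement to a correlation-centered ranking.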

3.3 Dependence on ECT2 for Cell Viability in Cell Lines Bearing a Genomic Gain at the 3q26 Locus

Data on the essentiality of ECT2 to PDAC cell viability from the shRNA pooled screen by

Marcotte et al. showed a promising trilateral trend between copy number gain, expression and

essentiality. These findings prompted targeted analysis of the effects of shRNA-mediated ECT2

inhibition on cell viability, and importantly, the assessment of whether or not these effects were

associated with ECT2 copy number gains, as suggested by the shRNA pooled screen.

One of the most marked results from the targeted analyses in this study was the relatively

weak concordance of effects on cell viability observed between the top two shRNAs (fastest

dropouts), shRNA-1 (ECT2-1) and shRNA-2, in the cell lines tested. Moreover, there was minimal

concordance between effects of ECT2-1 and the other four shRNAs targeting ECT2. The ECT2-1

shRNA is consistently the fastest dropout in the pooled screen and is the shRNA with the strongest

effect on cell viability in all cell lines tested in this study. Essentiality of a gene to cell viability in the

pooled shRNA screen is determined by Gene Activity Rank Profile (GARP) scores for each gene.

Since GARP scores for genes in the pooled screen are given by the average of the shRNA Activity

Rank Profile (shARP) scores of the two fastest drop-out shRNAs, and ECT2-1 is the fastest shRNA

drop-out with the strongest effects on cell viability, it is conceivable that the GARP scores in the

pooled screen may be heavily weighted by the effect of ECT2-1. This indicates that the ECT2-1

shRNA may be driving essentiality/GARP scores observed for ECT2 in the pooled shRNA screen.
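The weighting effect described above can be sketched in a few lines. This is a minimal illustration based only on the definition given in the text (GARP as the average of the shARP scores of the two fastest drop-out shRNAs); the shARP values are hypothetical, the function name is illustrative, and it is assumed that more negative shARP scores correspond to faster drop-out.

```python
from statistics import mean

def garp_score(sharp_scores):
    """Average the two most negative (fastest drop-out) shARP scores."""
    if len(sharp_scores) < 2:
        raise ValueError("need at least two shRNAs per gene")
    two_fastest = sorted(sharp_scores)[:2]
    return mean(two_fastest)

# Hypothetical shARP values for five hairpins; not data from this study.
sharp = {
    "ECT2-1": -4.0,
    "ECT2-2": -1.0,
    "ECT2-3": -0.5,
    "ECT2-4": -0.2,
    "ECT2-5": 0.1,
}

garp_all = garp_score(list(sharp.values()))
garp_without_1 = garp_score([v for k, v in sharp.items() if k != "ECT2-1"])

print(f"GARP with ECT2-1:    {garp_all}")
print(f"GARP without ECT2-1: {garp_without_1}")
```

Under these assumed values, removing the single strongest hairpin shifts the gene-level score substantially, which is the sense in which one shRNA can dominate a score built from only the top two hairpins.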

These findings prompted investigation of the known properties of the ECT2-1 shRNA

construct as indexed in The RNAi Consortium (TRC) library database in order to determine whether the

effects of this shRNA are due to efficient ECT2 targeting or other confounding factors. Data on the

ECT2-1 shRNA construct from The RNAi Consortium indicate that this construct knocks down ECT2

least efficiently when compared to the other four ECT2 shRNAs, as measured by percent mRNA

remaining after knock-down (Appendix Table A4). Most intriguing, the TRC library reports off-

target effects for ECT2-1, but not for the other four shRNAs targeting ECT2. For the ECT2-1 shRNA,

in addition to ECT2, there are 7 other genes which are putative targets of this shRNA with a >76%

match of hairpin target sequence to the transcript RNA sequence. Among these genes that are

potential targets of ECT2-1 is PSMD1. An shRNA construct targeting PSMD1 was utilized in this

study to serve as a positive control because knock-down of this protein is known to have pan-lethal

effects on cell lines. These findings indicate that the strong effects on cell viability observed with the

ECT2-1 shRNA may not be accounted for by ECT2 knock-down alone, but may instead result from a

combination of less-specific knock-down of ECT2 and knock-down of other genes. The clear implication

of this analysis is that ECT2 may have been inaccurately characterized as essential to cell viability in

certain cell lines in the pooled shRNA screen as a result of observed effects of ECT2-1.
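The notion of a percent match between a hairpin target sequence and a transcript can be made concrete with a short sketch. The exact matching procedure used by the TRC library is not described here, so the ungapped sliding-window comparison below is an assumption, and the sequences are invented for illustration only. The 76% cut-off is the figure quoted in the text.

```python
def best_match_fraction(target, transcript):
    """Best ungapped identity of `target` against any window of `transcript`."""
    n = len(target)
    if len(transcript) < n:
        return 0.0
    best = 0
    for start in range(len(transcript) - n + 1):
        window = transcript[start:start + n]
        matches = sum(a == b for a, b in zip(target, window))
        best = max(best, matches)
    return best / n

# Hypothetical 21-nt hairpin target and two hypothetical transcripts.
target = "GATTACAGGCTTAACCGTTGA"
on_target_transcript = "CCCC" + target + "TTTT"
# Same site carrying four substitutions (an imperfect, off-target-like site).
off_target_transcript = "CCCC" + "GAATACAGCCTTAGCCGTCGA" + "TTTT"

THRESHOLD = 0.76  # cut-off quoted in the text

on_frac = best_match_fraction(target, on_target_transcript)
off_frac = best_match_fraction(target, off_target_transcript)

print(f"on-target match:  {on_frac:.2f}")
print(f"off-target match: {off_frac:.2f}, flagged: {off_frac > THRESHOLD}")
```

A site with a handful of mismatches can still exceed such a threshold, which is why a partial match to a pan-lethal gene such as PSMD1 is a plausible confounder for a single hairpin.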

The optimal approach to addressing the problems identified in these shRNA studies is to

utilize siRNA or shRNA constructs targeting ECT2 that are both highly specific and rapidly diminish

ECT2 transcript levels over time. Once these shRNAs are obtained and validated, it will be

necessary to confirm specific knock-down of ECT2 at both the transcript and protein levels, as well

as assess the outcomes of rescuing effects on cell viability with an shRNA-resistant ECT2 clone.

Finally, it is worth mentioning that PDAC cell lines differ in their ability to take up

shRNA-containing lentivirus. As such, careful titering must be conducted in each cell line tested to

ascertain the appropriate volume of lentivirus to be used to infect cells, such that findings are not

confounded by lack of lentivirus uptake. Only through these strategies would it be possible to

reliably relate ECT2 knock-down to effects on cell viability and subsequently correlate these

observations with the genomic feature of ECT2 copy number gain. Moreover, once an

effective tool is developed to facilitate ECT2 knock-down, it would be extremely valuable to not

only assess the effects of targeted ECT2 inhibition in pancreatic cancer cell lines bearing genomic

gains at the ECT2 locus, but also on normal human epithelial cells to decipher differential

dependency on ECT2 in cancer cells.

3.4 Differential Sensitivity to Inhibitors of ECT2-Mediated Cellular Pathway in Cell Lines Bearing Genomic Copy Number Gains at the 3q26 Locus

While there are currently no available small molecules to target ECT2 directly, well-

characterized inhibitors of PLK1, which phosphorylates ECT2 and is necessary for its downstream

signaling, are available. The preliminary results obtained by testing the effects of two PLK1

inhibitors on 6 PDAC cell lines indicate there may be a trend towards increasing sensitivity to PLK1

inhibition in cell lines with genetic gains of ECT2, but the evidence for this trend remains inconclusive.

The cell line Panc04.03 appears uniformly resistant to both PLK1 inhibitors across the

range of concentrations at which these compounds affect all other cell lines. While Panc04.03 was

characterized in this study by a one-copy loss encompassing ECT2, there may certainly be other

genomic or transcriptomic features of this cell line contributing to its resistance to PLK1 inhibition.

These must be explored before resistance to PLK1 inhibitors in this cell line can be reasonably

attributed to ECT2 loss. On the other hand, the cell line KP4 was the most sensitive to both PLK1

inhibitors across the entire range of concentrations tested. In addition, examination of the effects of

shRNA-mediated ECT2 interference demonstrated that this cell line appears to be overall highly

susceptible to all shRNAs tested. These findings may be the result of a generalized sensitivity of this

cell line to perturbation.

Finally, there was no clearly observed differential sensitivity to PLK1 inhibition in 4 cell

lines bearing genomic ECT2 copy number gains in comparison to the one cell line diploid at the

ECT2 locus. Notwithstanding the lack of differential activity, all of these cell lines were found to be

sensitive to PLK1 inhibition. This indicates that PLK1 inhibitors may have a toxic effect on PDAC

cell lines, and the extent of toxicity may or may not be linked to genomic features hypothesized to

play a role, such as ECT2 copy number gains. However, more diploid cell lines must be tested and a

clear difference in PLK1 inhibitor susceptibility must be demonstrated before a correlation

between ECT2 copy number gains and PLK1 inhibitor susceptibility can be established.

It is important to also recognize that PLK1 inhibitors are not completely selective for PLK1

over other PLKs (PLK2, PLK3), and can thus potentially affect other cellular pathways (Appendix

Table A6). However, the compound GSK461364 has 400-fold selectivity for PLK1 over the other

PLKs. In addition, PLK1 itself is known to be involved in other cellular roles and therefore inhibition

of its activity not only affects ECT2 activity, but may affect many other proteins involved in the

PLK1 cellular signaling network (Strebhardt, 2011). Another factor to take into consideration is the

concentration range of the tested compounds. Additional concentrations of each compound may

need to be tested, as there may be an optimal therapeutic range or therapeutic window of efficacy

of PLK1 inhibitors in cell lines bearing 3q26 gains. Furthermore, the genetic status of the target of

these compounds itself, PLK1, cannot be ignored. Discovery of PLK1 mutations in human cancer

prompted the development of PLK1 inhibitors. As such, mutations of PLK1 itself would

undoubtedly impact the observed results of PLK1 inhibitors on cell viability. Therefore, PLK1

mutations, as well as gene expression, should be assessed in the tested cell lines. Finally, as with

shRNA experiments, pharmacological assays may also be performed on normal epithelial cell lines

to gauge the selectivity of PLK1 inhibition for cancer cells over benign tissue.
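Comparing inhibitor sensitivity between cell lines across a concentration range is often summarized by a half-maximal inhibitory concentration (IC50). The sketch below is one simple way to estimate it, by linear interpolation on a log-concentration axis between the two tested doses that bracket 50% viability. It is not the analysis performed in this study; the function name, concentrations and viability values are hypothetical.

```python
import math

def ic50_log_interp(concs, viabilities):
    """Estimate IC50 by interpolating on log10(concentration).

    concs: ascending concentrations (same units throughout).
    viabilities: fraction of untreated control at each concentration.
    Returns None if viability never crosses 0.5 in the tested range.
    """
    points = list(zip(concs, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None

# Hypothetical dose-response readings (concentrations in nM).
doses = [1, 10, 100, 1000]
sensitive_line = [0.9, 0.7, 0.3, 0.1]
resistant_line = [0.95, 0.9, 0.85, 0.8]

ic50_sensitive = ic50_log_interp(doses, sensitive_line)
ic50_resistant = ic50_log_interp(doses, resistant_line)

print("sensitive line IC50 (nM):", ic50_sensitive)
print("resistant line IC50 (nM):", ic50_resistant)
```

A return value of None for a resistant line signals that the tested range did not reach 50% inhibition, which is one practical reason additional concentrations may need to be tested before lines can be compared.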

3.5 Future Directions

3.5.1 Rationale

The ultimate goal of genomic analyses in cancer is to gain insight into the biological

mechanisms which underlie cancer development and to develop improved therapeutic options for

patients, resulting in improved patient outcomes and effective treatment. Part of the challenge in

the therapeutics of cancer is that cancer is a heterogeneous disease, as exemplified by the highly

variable landscape of somatic mutation types and mutated genes across different tumor types

(Gerlinger et al., 2012). Even more striking in PDAC, as with other cancers, primary tumor lesions

show different mutation patterns than metastatic lesions in the same patient, and the primary

tumors themselves comprise multiple sub-populations of tumor cells (Campbell et al., 2010; Samuel

and Hudson, 2011). Tumor genetic and molecular heterogeneity necessitates therapy that is

tailored to target the specific genetic aberrations characteristic of the individual tumor. This

paradigm of precision medicine, or personalized medicine, is indeed the direction in which genomic

cancer research is headed. This study utilized a rational approach to therapeutic target identification for

pancreatic ductal adenocarcinoma. Presumably, genes that are targets of genetic gains and found to

play a role in the neoplastic process are viable putative therapeutic targets. Moreover, the genetic

gain itself represents a clinical biomarker that can be used to predict which patients may benefit

from therapy targeting tumors that bear the genetic gain.

Through analysis of somatic copy number gains in human PDAC tumor samples in public

datasets, it was possible to identify regions of genomic gain that are recurrently observed in the

PDAC tumors profiled, as well as the genes mapping to these regions. Using a panel of human PDAC

cell lines, it was then possible to identify which genes are also highly expressed in the context of

their genomic gain. This analysis, combined with assessment of how the top-ranked genes

performed in a pooled shRNA screen on the same cell lines, suggested that ECT2 might be a

promising target for further study. While ECT2 has been characterized and implicated in various

human malignancies, it is yet to be identified as a putative oncogene and therapeutic target in PDAC

and thus presents a novel target for further investigation in PDAC. It will be necessary to assess the

extent to which genetic gains of ECT2 confer growth advantage in PDACs, as well as the cellular

mechanism by which this occurs.

3.5.2 Specific Aims

Targeted shRNA-mediated interference experiments performed in this study are suggestive

of a potential trend between ECT2 copy number and essentiality to cell viability. However, to better

characterize this association, it is necessary to test more PDAC cell lines, as well as perhaps other

cell types such as fibroblasts to assess effects of shRNA-mediated interference. Moreover, it is

necessary to utilize shRNAs that effectively knock down ECT2 expression and this must be

validated at the mRNA and protein level.

The phenotypic endpoint of shRNA-mediated ECT2 interference in this study was cell

viability as determined by nuclei counts of cells following shRNA treatment. However, given the

role of ECT2 in normal cellular processes such as cytokinesis and cell division, presumably, ECT2

knock-down may affect migration and invasion of tumor cells as well as lead to defects in cell

division. Experiments assaying these phenotypes following ECT2 knock-down are necessary.

Finally, it is necessary to elucidate the cellular mechanism through which ECT2-dependent tumor

progression occurs. ECT2 is a GEF for Rho GTPases and it would be interesting to determine

which, if any, Rho GTPases are most dependent on aberrant ECT2 activity for tumor maintenance

and progression, as this is yet to be determined.

In addition, analysis of the role of ECT2 in PDAC in vivo would be extremely valuable to

understand its role in the context of a biological system and potential interactions with the tumor

microenvironment. Pharmacologic studies in mice would also be necessary to ascertain the extent

to which drug activity in vitro predicts effects on tumor growth in a biological system. Also of

importance is a survey of an independent cohort of primary patient PDACs to validate the

frequency of ECT2/3q26 copy number gains and potentially correlate this genomic feature with

survival and patient outcomes.

Finally, while this study was aimed at identifying therapeutic targets for PDAC, additional

causes for the poor clinical success of traditional anticancer agents in PDAC cannot be ignored. Of

particular importance is the low tumor vascularity and poor perfusion of PDAC tumors that limits

drug delivery to the tumor site (Wang et al., 2011). These factors must be considered in early

studies aimed at molecular target development, since their clinical utility may not be realized if

other biological factors are not taken into account.

References

Albertson, D.G. (2006). Gene amplification in cancer. Trends Genet 22, 447-455.

Barretina, J., Caponigro, G., Stransky, N., Venkatesan, K., Margolin, A.A., Kim, S., Wilson, C.J., Lehar, J., Kryukov, G.V., Sonkin, D., et al. (2012). The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603-607.

Baudis, M. (2007). Genomic imbalances in 5918 malignant epithelial tumors: an explorative meta-analysis of chromosomal CGH data. BMC Cancer 7, 226.

Beroukhim, R., Mermel, C.H., Porter, D., Wei, G., Raychaudhuri, S., Donovan, J., Barretina, J., Boehm, J.S., Dobson, J., Urashima, M., et al. (2010). The landscape of somatic copy-number alteration across human cancers. Nature 463, 899-905.

Bignell, G.R., Greenman, C.D., Davies, H., Butler, A.P., Edkins, S., Andrews, J.M., Buck, G., Chen, L., Beare, D., Latimer, C., et al. (2010). Signatures of mutation and selection in the cancer genome. Nature 463, 893-898.

Bork, P., Hofmann, K., Bucher, P., Neuwald, A.F., Altschul, S.F., and Koonin, E.V. (1997). A superfamily of conserved domains in DNA damage-responsive cell cycle checkpoint proteins. FASEB J 11, 68-76.

Brodeur, G.M., and Hogarty, M.D. (1998). Gene amplification in human cancers: biological and clinical significance. In The Genetic Basis of Human Cancer, B. Vogelstein and K.W. Kinzler, eds. (New York: McGraw-Hill), pp. 161-172.

Buchholz, M., Braun, M., Heidenblut, A., Kestler, H.A., Kloppel, G., Schmiegel, W., Hahn, S.A., Luttges, J., and Gress, T.M. (2005). Transcriptome analysis of microdissected pancreatic intraepithelial neoplastic lesions. Oncogene 24, 6626-6636.

Burris, H.A., 3rd, Moore, M.J., Andersen, J., Green, M.R., Rothenberg, M.L., Modiano, M.R., Cripps, M.C., Portenoy, R.K., Storniolo, A.M., Tarassoff, P., et al. (1997). Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: a randomized trial. J Clin Oncol 15, 2403-2413.

Bursavich, M.G., Parker, D.P., Willardsen, J.A., Gao, Z.H., Davis, T., Ostanin, K., Robinson, R., Peterson, A., Cimbora, D.M., Zhu, J.F., et al. (2010). 2-Anilino-4-aryl-1,3-thiazole inhibitors of valosin-containing protein (VCP or p97). Bioorg Med Chem Lett 20, 1677-1679.

Caldas, C., Hahn, S.A., da Costa, L.T., Redston, M.S., Schutte, M., Seymour, A.B., Weinstein, C.L., Hruban, R.H., Yeo, C.J., and Kern, S.E. (1994). Frequent somatic mutations and homozygous deletions of the p16 (MTS1) gene in pancreatic adenocarcinoma. Nat Genet 8, 27-32.

Calhoun, E.S., Jones, J.B., Ashfaq, R., Adsay, V., Baker, S.J., Valentine, V., Hempen, P.M., Hilgers, W., Yeo, C.J., Hruban, R.H., et al. (2003). BRAF and FBXW7 (CDC4, FBW7, AGO, SEL10) mutations in distinct subsets of pancreatic cancer: potential therapeutic targets. Am J Pathol 163, 1255-1260.

Callebaut, I., and Mornon, J.P. (1997). From BRCA1 to RAP1: a widespread BRCT module closely associated with DNA repair. FEBS Lett 400, 25-30.

Campbell, P.J., Yachida, S., Mudie, L.J., Stephens, P.J., Pleasance, E.D., Stebbings, L.A., Morsberger, L.A., Latimer, C., McLaren, S., Lin, M.L., et al. (2010). The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467, 1109-1113.

Cheng, J.Q., Ruggeri, B., Klein, W.M., Sonoda, G., Altomare, D.A., Watson, D.K., and Testa, J.R. (1996). Amplification of AKT2 in human pancreatic cells and inhibition of AKT2 expression and tumorigenicity by antisense RNA. Proc Natl Acad Sci U S A 93, 3636-3641.

Chiang, D.Y., Getz, G., Jaffe, D.B., O'Kelly, M.J., Zhao, X., Carter, S.L., Russ, C., Nusbaum, C., Meyerson, M., and Lander, E.S. (2009). High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat Methods 6, 99-103.

Chu, G.C., Kimmelman, A.C., Hezel, A.F., and DePinho, R.A. (2007). Stromal biology of pancreatic cancer. J Cell Biochem 101, 887-907.

Conroy, T., Desseigne, F., Ychou, M., Bouche, O., Guimbaud, R., Becouarn, Y., Adenis, A., Raoul, J.L., Gourgou-Bourgade, S., de la Fouchardiere, C., et al. (2011). FOLFIRINOX versus gemcitabine for metastatic pancreatic cancer. N Engl J Med 364, 1817-1825.

Dechant, R., and Glotzer, M. (2003). Centrosome separation and central spindle assembly act in redundant pathways that regulate microtubule density and trigger cleavage furrow formation. Dev Cell 4, 333-344.

Esteva, F.J., Yu, D., Hung, M.C., and Hortobagyi, G.N. (2010). Molecular predictors of response to trastuzumab and lapatinib in breast cancer. Nat Rev Clin Oncol 7, 98-107.

Fields, A.P., and Justilien, V. (2010). The guanine nucleotide exchange factor (GEF) Ect2 is an oncogene in human cancer. Adv Enzyme Regul 50, 190-200.

Fu, B., Luo, M., Lakkur, S., Lucito, R., and Iacobuzio-Donahue, C.A. (2008). Frequent genomic copy number gain and overexpression of GATA-6 in pancreatic carcinoma. Cancer Biol Ther 7, 1593-1601.

Garraway, L.A., Widlund, H.R., Rubin, M.A., Getz, G., Berger, A.J., Ramaswamy, S., Beroukhim, R., Milner, D.A., Granter, S.R., Du, J., et al. (2005). Integrative genomic analyses identify MITF as a lineage survival oncogene amplified in malignant melanoma. Nature 436, 117-122.

GeneCards (2011). GeneCards v3.

Gerlinger, M., Rowan, A.J., Horswell, S., Larkin, J., Endesfelder, D., Gronroos, E., Martinez, P., Matthews, N., Stewart, A., Tarpey, P., et al. (2012). Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med 366, 883-892.

Hahn, S.A., Schutte, M., Hoque, A.T., Moskaluk, C.A., da Costa, L.T., Rozenblum, E., Weinstein, C.L., Fischer, A., Yeo, C.J., Hruban, R.H., et al. (1996). DPC4, a candidate tumor suppressor gene at human chromosome 18q21.1. Science 271, 350-353.

Harada, T., Chelala, C., Crnogorac-Jurcevic, T., and Lemoine, N.R. (2009). Genome-wide analysis of pancreatic cancer using microarray-based techniques. Pancreatology 9, 13-24.

Harris, T.J., and McCormick, F. (2010). The molecular pathology of cancer. Nat Rev Clin Oncol 7, 251-265.

Haverty, P.M., Hon, L.S., Kaminker, J.S., Chant, J., and Zhang, Z. (2009). High-resolution analysis of copy number alterations and associated expression changes in ovarian tumors. BMC Med Genomics 2, 21.

Heidenblad, M., Schoenmakers, E.F., Jonson, T., Gorunova, L., Veltman, J.A., van Kessel, A.G., and Hoglund, M. (2004). Genome-wide array-based comparative genomic hybridization reveals multiple amplification targets and novel homozygous deletions in pancreatic carcinoma cell lines. Cancer Res 64, 3052-3059.

Hermann, P.C., Huber, S.L., Herrler, T., Aicher, A., Ellwart, J.W., Guba, M., Bruns, C.J., and Heeschen, C. (2007). Distinct populations of cancer stem cells determine tumor growth and metastatic activity in human pancreatic cancer. Cell Stem Cell 1, 313-323.

Heselmeyer, K., Macville, M., Schrock, E., Blegen, H., Hellstrom, A.C., Shah, K., Auer, G., and Ried, T. (1997). Advanced-stage cervical carcinomas are defined by a recurrent pattern of chromosomal aberrations revealing high genetic instability and a consistent gain of chromosome arm 3q. Genes Chromosomes Cancer 19, 233-240.

Hirata, D., Yamabuki, T., Miki, D., Ito, T., Tsuchiya, E., Fujita, M., Hosokawa, M., Chayama, K., Nakamura, Y., and Daigo, Y. (2009). Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15, 256-266.

Holzmann, K., Kohlhammer, H., Schwaenen, C., Wessendorf, S., Kestler, H.A., Schwoerer, A., Rau, B., Radlwimmer, B., Dohner, H., Lichter, P., et al. (2004). Genomic DNA-chip hybridization reveals a higher incidence of genomic amplifications in pancreatic cancer than conventional comparative genomic hybridization and leads to the identification of novel candidate genes. Cancer Res 64, 4428-4433.

Hong, S.M., Li, A., Olino, K., Wolfgang, C.L., Herman, J.M., Schulick, R.D., Iacobuzio-Donahue, C., Hruban, R.H., and Goggins, M. (2011a). Loss of E-cadherin expression and outcome among patients with resectable pancreatic adenocarcinomas. Mod Pathol 24, 1237-1247.

Hong, S.M., Park, J.Y., Hruban, R.H., and Goggins, M. (2011b). Molecular signatures of pancreatic cancer. Arch Pathol Lab Med 135, 716-727.

Hong, S.P., Wen, J., Bang, S., Park, S., and Song, S.Y. (2009). CD44-positive cells are responsible for gemcitabine resistance in pancreatic cancer cells. Int J Cancer 125, 2323-2331.

Hruban, R.H., Goggins, M., Parsons, J., and Kern, S.E. (2000). Progression model for pancreatic cancer. Clin Cancer Res 6, 2969-2972.

ICGC (2010). International Cancer Genome Consortium. In Version 5.

Jemal, A., Siegel, R., Xu, J., and Ward, E. (2010). Cancer statistics, 2010. CA Cancer J Clin 60, 277-300.

Justilien, V., and Fields, A.P. (2009). Ect2 links the PKCiota-Par6alpha complex to Rac1 activation and cellular transformation. Oncogene 28, 3597-3607.

Kim, J.E., Billadeau, D.D., and Chen, J. (2005). The tandem BRCT domains of Ect2 are required for both negative and positive regulation of Ect2 in cytokinesis. J Biol Chem 280, 5733-5739.

Kitoh, H., Ryozawa, S., Harada, T., Kondoh, S., Furuya, T., Kawauchi, S., Oga, A., Okita, K., and Sasaki, K. (2005). Comparative genomic hybridization analysis for pancreatic cancer specimens obtained by endoscopic ultrasonography-guided fine-needle aspiration. J Gastroenterol 40, 511-517.

Klimstra, D.S., and Longnecker, D.S. (1994). K-ras mutations in pancreatic ductal proliferative lesions. Am J Pathol 145, 1547-1550.

Kloppel, G. (1998). Clinicopathologic view of intraductal papillary-mucinous tumor of the pancreas. Hepatogastroenterology 45, 1981-1985.

Knox, C., Law, V., Jewison, T., Liu, P., Ly, S., Frolkis, A., Pon, A., Banco, K., Mak, C., Neveu, V., et al. (2011). DrugBank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic Acids Res 39, D1035-1041.

Li, A., Liu, Z., Lezon-Geyda, K., Sarkar, S., Lannin, D., Schulz, V., Krop, I., Winer, E., Harris, L., and Tuck, D. (2011). GPHMM: an integrated hidden Markov model for identification of copy number alteration and loss of heterozygosity in complex tumor samples using whole genome SNP arrays. Nucleic Acids Res 39, 4928-4941.

Li, D., Xie, K., Wolff, R., and Abbruzzese, J.L. (2004). Pancreatic cancer. Lancet 363, 1049-1057.

Lin, L., Wang, Z., Prescott, M.S., van Dekken, H., Thomas, D.G., Giordano, T.J., Chang, A.C., Orringer, M.B., Gruber, S.B., Moran, J.V., et al. (2006). Multiple forms of genetic instability within a 2-Mb chromosomal segment of 3q26.3-q27 are associated with development of esophageal adenocarcinoma. Genes Chromosomes Cancer 45, 319-331.

Lin, S.M., Du, P., Huber, W., and Kibbe, W.A. (2008). Model-based variance-stabilizing transformation for Illumina microarray data. Nucleic Acids Res 36, e11.

Liu, X.F., Ishida, H., Raziuddin, R., and Miki, T. (2004). Nucleotide exchange factor ECT2 interacts with the polarity protein complex Par6/Par3/protein kinase Czeta (PKCzeta) and regulates PKCzeta activity. Mol Cell Biol 24, 6665-6675.

Loukopoulos, P., Shibata, T., Katoh, H., Kokubu, A., Sakamoto, M., Yamazaki, K., Kosuge, T., Kanai, Y., Hosoda, F., Imoto, I., et al. (2007). Genome-wide array-based comparative genomic hybridization analysis of pancreatic adenocarcinoma: identification of genetic indicators that predict patient outcome. Cancer Sci 98, 392-400.

Lynch, H.T., Smyrk, T., Kern, S.E., Hruban, R.H., Lightdale, C.J., Lemon, S.J., Lynch, J.F., Fusaro, L.R., Fusaro, R.M., and Ghadirian, P. (1996). Familial pancreatic cancer: a review. Semin Oncol 23, 251-275.

Maitra, A., and Hruban, R.H. (2008). Pancreatic cancer. Annu Rev Pathol 3, 157-188.

Marcotte, R., Brown, K.R., Suarez, F., Sayad, A., Karamboulas, K., Kryzankowski, P.M., Sircoulomb, F., Medrano, M., Fedyshyn, Y., Koh, J.L.Y., et al. (2012). Essential Gene Profiles in Breast, Pancreatic, and Ovarian Cancer Cells. Cancer Discovery 2, 172.

Meyer, S., Fergusson, W.D., Whetton, A.D., Moreira-Leite, F., Pepper, S.D., Miller, C., Saunders, E.K., White, D.J., Will, A.M., Eden, T., et al. (2007). Amplification and translocation of 3q26 with overexpression of EVI1 in Fanconi anemia-derived childhood acute myeloid leukemia with biallelic FANCD1/BRCA2 disruption. Genes Chromosomes Cancer 46, 359-372.

Miki, T., Smith, C.L., Long, J.E., Eva, A., and Fleming, T.P. (1993). Oncogene ect2 is related to regulators of small GTP-binding proteins. Nature 362, 462-465.

Moffat, J., Grueneberg, D.A., Yang, X., Kim, S.Y., Kloepfer, A.M., Hinkle, G., Piqani, B., Eisenhaure, T.M., Luo, B., Grenier, J.K., et al. (2006). A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell 124, 1283-1298.

Moore, M.J., Goldstein, D., Hamm, J., Figer, A., Hecht, J.R., Gallinger, S., Au, H.J., Murawa, P., Walde, D., Wolff, R.A., et al. (2007). Erlotinib plus gemcitabine compared with gemcitabine alone in patients with advanced pancreatic cancer: a phase III trial of the National Cancer Institute of Canada Clinical Trials Group. J Clin Oncol 25, 1960-1966.

NCI/NCBI (2001). NCI and NCBI's SKY/M-FISH Database.

Neesse, A., Michl, P., Frese, K.K., Feig, C., Cook, N., Jacobetz, M.A., Lolkema, M.P., Buchholz, M., Olive, K.P., Gress, T.M., et al. (2011). Stromal biology and therapy in pancreatic cancer. Gut 60, 861-868.

Niiya, F., Tatsumoto, T., Lee, K.S., and Miki, T. (2006). Phosphorylation of the cytokinesis regulator ECT2 at G2/M phase stimulates association of the mitotic kinase Plk1 and accumulation of GTP-bound RhoA. Oncogene 25, 827-837.

Oceguera-Yanez, F., Kimura, K., Yasuda, S., Higashida, C., Kitamura, T., Hiraoka, Y., Haraguchi, T., and Narumiya, S. (2005). Ect2 and MgcRacGAP regulate the activation and function of Cdc42 in mitosis. J Cell Biol 168, 221-232.

Olive, K.P., Jacobetz, M.A., Davidson, C.J., Gopinathan, A., McIntyre, D., Honess, D., Madhu, B., Goldgraben, M.A., Caldwell, M.E., Allard, D., et al. (2009). Inhibition of Hedgehog signaling enhances delivery of chemotherapy in a mouse model of pancreatic cancer. Science 324, 1457-1461.

Olshen, A.B., Venkatraman, E.S., Lucito, R., and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557-572.

Petronczki, M., Glotzer, M., Kraut, N., and Peters, J.M. (2007). Polo-like kinase 1 triggers the initiation of cytokinesis in human cells by promoting recruitment of the RhoGEF Ect2 to the central spindle. Dev Cell 12, 713-725.

Pinkel, D., and Albertson, D.G. (2005). Array comparative genomic hybridization and its applications in cancer. Nat Genet 37 Suppl, S11-17.

Pinto, D., Darvishi, K., Shi, X., Rajan, D., Rigler, D., Fitzgerald, T., Lionel, A.C., Thiruvahindrapuram, B., Macdonald, J.R., Mills, R., et al. (2011). Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants. Nat Biotechnol 29, 512-520.

Prokopenko, S.N., Brumby, A., O'Keefe, L., Prior, L., He, Y., Saint, R., and Bellen, H.J. (1999). A putative exchange factor for Rho1 GTPase is required for initiation of cytokinesis in Drosophila. Genes Dev 13, 2301-2314.

Rachagani, S., Senapati, S., Chakraborty, S., Ponnusamy, M.P., Kumar, S., Smith, L.M., Jain, M., and Batra, S.K. (2011). Activated KrasG12D is associated with invasion and metastasis of pancreatic cancer cells through inhibition of E-cadherin. Br J Cancer 104, 1038-1048.

Rozenblum, E., Schutte, M., Goggins, M., Hahn, S.A., Panzer, S., Zahurak, M., Goodman, S.N., Sohn, T.A., Hruban, R.H., Yeo, C.J., et al. (1997). Tumor-suppressive pathways in pancreatic carcinoma. Cancer Res 57, 1731-1734.

Ruggeri, B., Zhang, S.Y., Caamano, J., DiRado, M., Flynn, S.D., and Klein-Szanto, A.J. (1992). Human pancreatic carcinomas and cell lines reveal frequent and multiple alterations in the p53 and Rb-1 tumor-suppressor genes. Oncogene 7, 1503-1511.

Ruggeri, B.A., Huang, L., Wood, M., Cheng, J.Q., and Testa, J.R. (1998). Amplification and overexpression of the AKT2 oncogene in a subset of human pancreatic ductal adenocarcinomas. Mol Carcinog 21, 81-86.

Russ, A.P., and Lampel, S. (2005). The druggable genome: an update. Drug Discov Today 10, 1607-1610.

Saito, S., Liu, X.F., Kamijo, K., Raziuddin, R., Tatsumoto, T., Okamoto, I., Chen, X., Lee, C.C., Lorenzi, M.V., Ohara, N., et al. (2004). Deregulation and mislocalization of the cytokinesis regulator ECT2 activate the Rho signaling pathways leading to malignant transformation. J Biol Chem 279, 7169-7179.

Saito, S., Tatsumoto, T., Lorenzi, M.V., Chedid, M., Kapoor, V., Sakata, H., Rubin, J., and Miki, T. (2003). Rho exchange factor ECT2 is induced by growth factors and regulates cytokinesis through the N-terminal cell cycle regulator-related domains. J Cell Biochem 90, 819-836.

Salhia, B., Tran, N.L., Chan, A., Wolf, A., Nakada, M., Rutka, F., Ennis, M., McDonough, W.S., Berens, M.E., Symons, M., et al. (2008). The guanine nucleotide exchange factors trio, Ect2, and Vav3 mediate the invasive behavior of glioblastoma. Am J Pathol 173, 1828-1838.

Samuel, N., and Hudson, T.J. (2011). The molecular and cellular heterogeneity of pancreatic ductal adenocarcinoma. Nat Rev Gastroenterol Hepatol 9, 77-87.

Sano, M., Genkai, N., Yajima, N., Tsuchiya, N., Homma, J., Tanaka, R., Miki, T., and Yamanaka, R. (2006). Expression level of ECT2 proto-oncogene correlates with prognosis in glioma patients. Oncol Rep 16, 1093-1098.

Schumacher, S., Gryzik, T., Tannebaum, S., and Muller, H.A. (2004). The RhoGEF Pebble is required for cell shape changes during cell migration triggered by the Drosophila FGF receptor Heartless. Development 131, 2631-2640.

Shah, A.N., Summy, J.M., Zhang, J., Park, S.I., Parikh, N.U., and Gallick, G.E. (2007). Development and characterization of gemcitabine-resistant pancreatic tumor cells. Ann Surg Oncol 14, 3629-3637.

Sophic (2012). Sophic Druggable Genome, S.S.A. Inc., ed.

Stratton, M.R. (2011). Exploring the genomes of cancer cells: progress and promise. Science 331, 1553-1558.

Strebhardt, K. (2011). Multifaceted polo-like kinases: drug targets and antitargets for cancer therapy. Nat Rev Drug Discov 9, 643-660.

Sun, Y. (2006). E3 ubiquitin ligases as cancer targets and biomarkers. Neoplasia 8, 645-654.

Suzuki, A., Shibata, T., Shimada, Y., Murakami, Y., Horii, A., Shiratori, K., Hirohashi, S., Inazawa, J., and Imoto, I. (2008). Identification of SMURF1 as a possible target for 7q21.3-22.1 amplification detected in a pancreatic cancer cell line by in-house array-based comparative genomic hybridization. Cancer Sci 99, 986-994.

Tatsumoto, T., Sakata, H., Dasso, M., and Miki, T. (2003). Potential roles of the nucleotide exchange factor ECT2 and Cdc42 GTPase in spindle assembly in Xenopus egg cell-free extracts. J Cell Biochem 90, 892-900.

Tatsumoto, T., Xie, X., Blumenthal, R., Okamoto, I., and Miki, T. (1999). Human ECT2 is an exchange factor for Rho GTPases, phosphorylated in G2/M phases, and involved in cytokinesis. J Cell Biol 147, 921-928.

Thompson, L.H., Brookman, K.W., Jones, N.J., Allen, S.A., and Carrano, A.V. (1990). Molecular cloning of the human XRCC1 gene, which corrects defective DNA strand break repair and sister chromatid exchange. Mol Cell Biol 10, 6160-6171.

Uhlen, M., Oksvold, P., Fagerberg, L., Lundberg, E., Jonasson, K., Forsberg, M., Zwahlen, M., Kampf, C., Wester, K., Hober, S., et al. (2010). Towards a knowledge-based Human Protein Atlas. Nat Biotechnol 28, 1248-1250.

Villarroel, M.C., Rajeshkumar, N.V., Garrido-Laguna, I., De Jesus-Acosta, A., Jones, S., Maitra, A., Hruban, R.H., Eshleman, J.R., Klein, A., Laheru, D., et al. (2011). Personalizing cancer treatment in the age of global genomic analyses: PALB2 gene mutations and the response to DNA damaging agents in pancreatic cancer. Mol Cancer Ther 10, 3-8.

Vogelstein, B., and Kinzler, K.W. (2004). Cancer genes and the pathways they control. Nat Med 10, 789-799.

Volik, S., Zhao, S., Chin, K., Brebner, J.H., Herndon, D.R., Tao, Q., Kowbel, D., Huang, G., Lapuk, A., Kuo, W.L., et al. (2003). End-sequence profiling: sequence-based analysis of aberrant genomes. Proc Natl Acad Sci U S A 100, 7696-7701.

Wallrapp, C., Muller-Pillasch, F., Solinas-Toldo, S., Lichter, P., Friess, H., Buchler, M., Fink, T., Adler, G., and Gress, T.M. (1997). Characterization of a high copy number amplification at 6q24 in pancreatic cancer identifies c-myb as a candidate oncogene. Cancer Res 57, 3135-3139.

Wang, Z., Li, Y., Ahmad, A., Banerjee, S., Azmi, A.S., Kong, D., and Sarkar, F.H. (2011). Pancreatic cancer: understanding and overcoming chemoresistance. Nat Rev Gastroenterol Hepatol 8, 27-33.

Wang, Z., Li, Y., Kong, D., Banerjee, S., Ahmad, A., Azmi, A.S., Ali, S., Abbruzzese, J.L., Gallick, G.E., and Sarkar, F.H. (2009). Acquisition of epithelial-mesenchymal transition phenotype of gemcitabine-resistant pancreatic cancer cells is linked with activation of the notch signaling pathway. Cancer Res 69, 2400-2407.

Wolfe, B.A., Takaki, T., Petronczki, M., and Glotzer, M. (2009). Polo-like kinase 1 directs assembly of the HsCyk-4 RhoGAP/Ect2 RhoGEF complex to initiate cleavage furrow formation. PLoS Biol 7, e1000110.

Yang, Y.L., Chu, J.Y., Luo, M.L., Wu, Y.P., Zhang, Y., Feng, Y.B., Shi, Z.Z., Xu, X., Han, Y.L., Cai, Y., et al. (2008). Amplification of PRKCI, located in 3q26, is associated with lymph node metastasis in esophageal squamous cell carcinoma. Genes Chromosomes Cancer 47, 127-136.

Yen, C.C., Chen, Y.J., Pan, C.C., Lu, K.H., Chen, P.C., Hsia, J.Y., Chen, J.T., Wu, Y.C., Hsu, W.H., Wang, L.S., et al. (2005). Copy number changes of target genes in chromosome 3q25.3-qter of esophageal squamous cell carcinoma: TP63 is amplified in early carcinogenesis but down-regulated as disease progressed. World J Gastroenterol 11, 1267-1272.

Yeo, T.P., Hruban, R.H., Leach, S.D., Wilentz, R.E., Sohn, T.A., Kern, S.E., Iacobuzio-Donahue, C.A., Maitra, A., Goggins, M., Canto, M.I., et al. (2002). Pancreatic cancer. Curr Probl Cancer 26, 176-275.

Yildirim, M.A., Goh, K.I., Cusick, M.E., Barabasi, A.L., and Vidal, M. (2007). Drug-target network. Nat Biotechnol 25, 1119-1126.

Zender, L., Spector, M.S., Xue, W., Flemming, P., Cordon-Cardo, C., Silke, J., Fan, S.T., Luk, J.M., Wigler, M., Hannon, G.J., et al. (2006). Identification and validation of oncogenes in liver cancer using an integrative oncogenomic approach. Cell 125, 1253-1267.

Zhang, M.L., Lu, S., Zhou, L., and Zheng, S.S. (2008). Correlation between ECT2 gene expression and methylation change of ECT2 promoter region in pancreatic cancer. Hepatobiliary Pancreat Dis Int 7, 533-538.

Zhong, Y., Wang, Z., Fu, B., Pan, F., Yachida, S., Dhara, M., Albesiano, E., Li, L., Naito, Y., Vilardell, F., et al. (2011). GATA6 activates Wnt signaling in pancreatic cancer by negatively regulating the Wnt antagonist Dickkopf-1. PLoS ONE 6, e22129.

Appendices

Citation | Tissue/Sample Type | Number of Samples Studied | Summary of Chromosomal Regions Amplified | Genes Affected | Additional Notes

Mahlamaki EH, et al. Frequent amplification of 8q24, 11q, 17q and 20q-specific genes in pancreatic cancer. Genes, Chromosomes & Cancer, 2002. 35:353-8.

Pancreatic cancer cell lines

31 See Table 1 Among the most frequently gained: 8q, 11q, 17q, 20q

MYC, CCND1, ERBB2, TBX2, BIRC5, BCL2L1, NCOA6, NCOA3, MYBL2, PTPN1, ZNF217, ck20.10e9 STK15 CTSZ

CGH and FISH to identify recurrent genetic changes in pancreatic cancer cell lines. Evaluated copy number changes of selected genes from the four regions by interphase FISH in 30 pancreatic cell lines.

Heidenblad M, et al. Genome-wide array-based comparative genomic hybridization reveals multiple amplification targets and novel homozygous deletions in pancreatic carcinoma cell lines. Cancer Research, 2004. 64:3052-9.

Pancreatic carcinoma cell lines

31 See Table 1 60 amplicons at 32 different locations: 8q (8 cases) 12p (7 cases) 7q (5 cases) 18q (5 cases) 19q (5 cases) 6p (4 cases) 8p (4 cases) Regions most frequently involved in amplifications: 6p21-22, 7q21-31, 8p11-12, 8q23-24, 12p11-12, 18q11-12, 19q13.2

DAD-R, SOX5, EK11 CCND3 HGFR (MET) AKT2

FISH-verified BAC clones and cDNA clones. Amplicons ranged from 0.4-38.1 Mb, average 8.4Mb, median 4.5Mb 18q amplifications were close to/at a deletion breakpoint

Mahlamaki EH, et al. High resolution genomic and expression profiling reveals 105 putative amplification target genes in pancreatic cancer. Neoplasia, 2004. 6(5): 432-439.


13 See Table 1 – 24 independent amplicons

105 genes (Table 2) PAK4

CGH on cDNA microarray to identify gene expression change events that were associated with gene copy number alterations. (Varying clone densities; average resolution of 300kb throughout the genome) Amplicons ranged in size from 13kb to 11Mb

Gysin S, et al. Analysis of genomic DNA alterations and mRNA expression patterns in a panel of human pancreatic cancer cell lines. Genes Chromosomes & Cancer, 2005. 44(1):37-51.

Pancreatic cell lines (derived from metastatic/primary tumor)

25 See Table 3 BASP1, EBF, TNF, MRSA, MYC, CCND1, BIRC3, TRIM29, KRAS, LOC81558, AKT2, VRK2, NCOA3

Table A1. Focal somatic copy number gains in pancreatic ductal adenocarcinoma in the literature.

Nowak NJ, et al. Genome-wide aberrations in pancreatic adenocarcinoma. Cancer Genetics and Cytogenetics, 2005. 161:36-50.

17 first passage xenografts and 16 cell lines

33 Recurrent gains: 7p21.1-p11.2, 7q21.32, 7q33, 8q1.1-q24, 11p13, 14q22.2, 20-12.2, 20q11.23-q13.3

See Table 2 and Table 3

Distinguish differences between cell line and xenograft aberration profiles.

Heidenblad M, et al. Microarray analyses reveal strong influence of DNA copy number alterations on the transcriptional patterns in pancreatic cancer: implications for the interpretation of genomic amplifications. Oncogene, 2005. 24:1794-1801.

Pancreatic carcinoma cell lines

29 67 recurrently over-expressed genes located in 7 precisely mapped commonly amplified regions. Two most frequently amplified regions in pancreatic cancer: 8q23-24, 12p11-12

FLJ12760, TLK2

Expression profiling analysis using cDNA microarrays and corroborating with genomic profiling data (Heidenblad, 2004). More than one putative target may be of importance in pancreatic cancer amplicons.

Harada T, et al. Identification of genetic alterations in pancreatic cancer by the combined use of tissue microdissection and array-based comparative genomic hybridization. British Journal of Cancer, 2007. 96:373-382.

Microdissected PDAC tissue samples, consisting of purified populations of cancer cells

23 7p and 18q IQCE, TRIAD3, PMS2, EIF2AK1, PSCD3, EIF2AK1,RAC1,PSCD3, CIGALT1, GLCCI1, ICA1, ETC1, DGKB, SNX13, 7A5, DNAH11, STK31, tcag7.981, CREB5

Harada T, et al. Genome-wide DNA copy number analysis in pancreatic cancer using high-density single nucleotide polymorphism arrays. Oncogene, 2008. 27:1951-1960.

Micro-dissected PDAC specimens

27 Frequent gains (>78% of cases) 1q, 2, 3, 5, 7p, 8q, 11, 14q, 17q

SKAP2/ SCAP2 gene (7p15.2) – most frequently amplified (63%) See Table 1 for list of “frequent gene CNs”

High-density microarrays representing 116 000 SNP loci

Kikuchi S, et al. Expression and gene amplification of actinin-4 in invasive ductal carcinoma of the pancreas. Clinical Cancer Research, 2008. 14(17):5348-56.

Tissue samples from invasive pancreatic ductal adenocarcinoma

173 19q13.1-2 Amplification of the ACTN4 gene was detected in 11/29 cases showing increased expression

ACTN4 CN was calculated by FISH, expression was knocked down by shRNA, and tumorigenicity was evaluated by orthotopic implantation into SCID mice.

Suzuki A, et al. Identification of SMURF1 as a possible target for 7q21.3-22.1 amplification detected in a pancreatic cancer cell line by in-house array-based comparative genomic hybridization. Cancer Science, 2008. 99(5):986-994.


24 7q21.3-22.1 SMURF1 May work as a growth-promoting gene and a good therapeutic target

Fu B, et al. Frequent genomic copy number gain and over-expression of GATA-6 in pancreatic carcinoma. Cancer Biology & Therapy, 2008. 7(10):1593-601.

Pancreatic cancer xenografts

42 18q11.2 GATA-6, cTAGE1

Representational Oligonucleotide Microarray Analysis (ROMA), and validating using FISH, qPCR, Western and immunohistochemical staining.

Lin LJ, et al. Integrated analysis of copy number alterations and loss of heterozygosity in human pancreatic cancer using a high-resolution, single nucleotide polymorphism array. Oncology, 2008. 75(1-2):102-12.

Pancreatic cancer cell lines/micro-dissected tissue specimens

25/14 23 amplified regions in at least 2 cell lines, including 8 unreported loci. See Table 2 for amplifications in cell lines and Table 3 for amplifications in patient samples. Size of minimal common amplification was at 13q22.2 Most frequently amplified region: 12p12.1-12p11.23 3q25.1, 5p15.2, 8q24.21, 11q14.1-2, 11q22.1-3, 14q11.2, 19q13.2

ARID4B Genes at newly identified loci: ARID4B, COL4A3, COLA4, WWTR1, TRIP, DNAH5, TNKS2, MAML2, TBC1D4, RAB8A

Screened genome-wide copy number alterations and LOH simultaneously in pancreatic cancer cell lines using an SNP array and validated the amplifications and LOH in primary pancreatic cancer tissue.

Chen S, et al. Copy number alterations in pancreatic cancer identify recurrent PAK4 amplification. Cancer Biology & Therapy, 2008. 7(11):1793-802.

Pancreatic adenocarcinomas

72 19q13 PAK4 Mechanism relies on KRAS2 activation/genomic amplification to activate PAK4. Complete data set publicly available.

Harada T, et al. Genome-wide analysis of pancreatic cancer using microarray-based techniques. Pancreatology, 2009. 9(1-2):13-24.

Cell lines/micro-dissected tissue specimens

6/23 See Table 2 See Table 2 1-Mb-spaced CGH arrays, then assessed transcript levels in regions of genetic alterations using Pancreatic Expression Database.

Laurila E, et al. Characterization of the 7q21-q22 amplicon identifies ARPC1A, a subunit of the Arp2/3 complex, as a regulator of cell migration and invasion in pancreatic cancer. Genes Chromosomes Cancer, 2009. 48(4):330-9.

Pancreatic cancer cell lines/primary pancreatic tumors

16/29 7q21-q22 ARPC1A – had the most statistically significant correlation between amplification and elevated expression.

FISH

Kuuselo R, et al. 19q13 amplification is associated with high grade and stage in pancreatic cancer. Genes Chromosomes Cancer, 2010. 49(6):569-75.

Primary pancreatic tumors Metastases Local recurrences Cancer cell lines from various tissues

357 151 24 120

19q13 19q13 amplification associated with poor tumor phenotype and shorter survival.

Campbell PJ, et al. The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature, 2010. 467:1109-1113.

Early passage cell lines from resected primary patient tumors Multiple metastases collected at autopsy

3 10

Supplementary Table 2

Table A2. Public pancreatic cancer genome datasets utilized in copy number gain analysis.

QCMG ICGC | OICR ICGC | JHU Pancreatic Cancer Genome Project | Harada et al, 2009

Reference December 6, 2010 Data Coordination Center Release (ICGC 3)

July 7, 2011 Data Coordination Center Release (ICGC 6)

Jones et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science, 2008; 321(5897):1801-6.

Harada et al. Genome-wide analysis of pancreatic cancer using microarray-based techniques. Pancreatology, 2009; 9:13-24.

Number of Samples

3 5 24 29

Sample Type(s)

2 primary tumors 1 cell line

5 primary tumors 14 cell lines 10 xenografts (derived from primary tumors and multiple metastases collected at autopsy)

23 microdissected tumor specimens 6 cell lines

Platform/ Technology

SOLiD Sequencing Nimblegen Human CGH 2.1M Whole-genome v2.0D Array

Illumina Infinium II Whole Genome Genotyping Assay (Beadchip Platform, 1M SNP loci)

BAC/PAC Whole Genome CGH array (Sanger Institute)

Genome Build

Hg19/NCBI37, February 2009

Hg19/NCBI37, February 2009 SNP positions based on hg18/NCBI36, March 2006

Hg18/NCBI36, March 2006

Number of genes (UCSC) encompassed in gains in at least one sample*

10 229 | 11 958 | 228* | 701†

Detailed Sample Information

ICGC-ABMP-20090811-04-CD is Panc05.04 (CRL2557), which is the same cell line as Pa18C in the JHU dataset. APGI-1959: ICGC-ABMP-20091020-01-TD (primary PDAC, AU sample) APGI-1992: ICGC-ABMP-20091203-06-TD (primary PDAC, AU sample)

5 primary pancreatic tumors obtained from Mayo Clinic, USA.

24 samples included 14 cell lines and 10 xenografts derived from 17 patients with surgically resected carcinomas and 7 patients who underwent rapid autopsy; 22 of the carcinomas were primary PDACs and 2 were infiltrating adenocarcinomas centered on the intrapancreatic bile duct; 9 of the cancers have already metastasized and were late-stage (Stages IIb or IV); 3 of the cell lines analyzed are available through ATCC (Pa14C is Panc08.13, Pa16C is Panc 10.05, and Pa18C is Panc05.04).

6 cell lines were acquired from Cancer Research UK Cell Services; 23 fresh-frozen PDAC tissue specimens were manually microdissected to >90% tumor cell purity.

Copy Number Analysis Method

CNV analysis of paired-end and/or LMP data from SOLiD. CNV-Seq was originally used (http://www.biomedcentral.com/1471-2105/10/80). In addition, a tool developed in-house, QCOPY, has also been used and compared to CNV-Seq. This tool is similar to CNV-Seq, but normalizes better for mappability and GC content.

The CNV analysis was conducted on the Nimblegen platform. The basic pipeline involves: first, normalization of the LogR of Cy-3 to Cy-5; minimization of the variance of this quantity in a moving window across the 2.1e6 linear probe-space; and finding local minima and recording their boundaries. These boundaries are then tested by conducting a 2-sided Kolmogorov-Smirnov (KS) test on the null hypothesis that the two regions separated by the putative boundary are the same, so that if the KS test yields a low p-value (the null hypothesis is rejected), the boundary can be accepted as genuine. This is performed recursively with respect to some empirical threshold found through previous experiments. The boundaries are then collated to obtain the initial ‘short’ table of contiguous segments. In order to impute CNV state and remove the baseline diploid signal, the underlying statistical parameters of the modes of the above log-signal (modeled as a sum of normal distributions) are estimated by a type of Bayesian analysis which maximizes the expectation value of the likelihood that the data are represented by a particular sum of normal modes. Assuming three modes (amplification, deletion and baseline), the mean and standard deviation for each are calculated. Using these parameters, the original segments are then assessed to determine which fall into the amplification or deletion categories, providing the CNVs. Note, given the binary nature of the Nimblegen platform, a particular run against a normal sample will automatically yield somatic CNVs.

Fluorescence intensity image files were processed using Illumina BeadStation software to provide a normalized intensity value (R) for each SNP position. For each SNP, the normalized experimental intensity value (R) was compared to the intensity values for that SNP from a training set of normal samples and represented as a ratio (LogR ratio) of log2(R_experimental/R_training set). Amplifications were defined by regions containing ≥3 SNPs with an average LogR ratio ≥0.9, with at least one SNP having a LogR ratio ≥1.4. All putative amplifications with identical boundaries in multiple samples were excluded. As focal amplifications are more likely to be useful in identifying specific target genes, a second set of criteria was used to exclude complex amplifications, large chromosomal regions or entire chromosomes that showed copy number gains (therefore, extensive criteria filtering was done).

‘aCGH-Smooth’ was used to detect DNA copy number alterations in each tumor sample. This software performs the data smoothing and breakpoint recognition using a local search algorithm. Based on preliminary data from seven normal vs. normal DNA hybridisations, the threshold for genetic ‘gains’ was determined as a smoothed log2 ratio ≥0.214, corresponding to ±2 standard deviations. High-level amplifications were defined as a log2 ratio ≥0.75, corresponding to a theoretical ±3.5-fold of the threshold for low-level alterations. All identified regions of alterations were verified by assessing the raw normalized data. The raw normalized (‘non-smoothed’) CGH data were analysed using the MSA software to identify minimal common regions (MCRs) or non-random genetic alterations with a statistical significance. ‘Non-random’ alterations were defined as genetic changes which were commonly identified in at least 14/29 PDACs (≥48% of samples).

QCMG: Queensland Center for Medical Genomics; ICGC: International Cancer Genome Consortium; OICR: Ontario Institute for Cancer Research; JHU: Johns Hopkins University; NG: NimbleGen; BAC/PAC: bacterial artificial chromosome/P1-derived artificial chromosome. *JHU study reported only high-level amplifications (excluded low-level gains). † Harada et al, 2009 contained genes encompassed in gains in at least 14 samples.
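
The threshold rules quoted in Table A2 for the JHU and Harada et al. datasets can be illustrated with a short Python sketch. This is only an illustration of the stated numeric criteria, not the original pipelines (Illumina BeadStation, aCGH-Smooth or MSA); the function names and the example values are hypothetical, and the region boundaries are assumed to have been determined beforehand.

```python
from statistics import mean


def is_jhu_amplification(log_r_ratios, min_snps=3, min_mean=0.9, min_peak=1.4):
    """JHU-style rule from Table A2: a region is called amplified if it holds
    at least `min_snps` SNPs, its average LogR ratio is >= `min_mean`, and at
    least one SNP has a LogR ratio >= `min_peak`."""
    if len(log_r_ratios) < min_snps:
        return False
    return mean(log_r_ratios) >= min_mean and max(log_r_ratios) >= min_peak


def classify_harada(smoothed_log2_ratio, gain=0.214, high_level=0.75):
    """Harada-style thresholds from Table A2 applied to one smoothed log2 ratio."""
    if smoothed_log2_ratio >= high_level:
        return "high-level amplification"
    if smoothed_log2_ratio >= gain:
        return "gain"
    return "no gain"


def is_non_random(samples_with_alteration, min_samples=14):
    """'Non-random' alteration: seen in at least 14 of the 29 PDAC samples."""
    return samples_with_alteration >= min_samples


# Hypothetical LogR ratios for three candidate regions.
print(is_jhu_amplification([1.0, 0.9, 1.5]))   # enough SNPs, mean and peak pass
print(is_jhu_amplification([1.0, 1.0, 1.0]))   # no SNP reaches the peak cutoff
print(is_jhu_amplification([1.5, 1.5]))        # fewer than three SNPs
print(classify_harada(0.3), "/", classify_harada(0.8))
```

Keeping the cutoffs as keyword arguments makes it easy to see how sensitive a call is to each threshold, which matters here because the four datasets were called with different criteria.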

Table A3. Cell Lines utilized in integrated analysis in this study.

Tumor type | ATCC | Cell line | Sex | Patient Age when tumor excised | Derived from
Pancreatic | CRL-1682 | AsPC-1 | Female | 62 | Met (ascites)
Pancreatic | CRL-1687 | BxPC-3 | Female | 61 | Primary, CFTR (-)
Pancreatic | HTB-79 | Capan-1 | Female | 61 | Primary, CFTR (+)
Pancreatic | HTB-80 | Capan-2 | Female | 56 | Primary
Pancreatic | CRL-1918 | CFPAC-1 | Male | 26 | Met (liver). Patient had cystic fibrosis
Pancreatic | CRL-2119 | HPAC | Male | 64 | Primary
Pancreatic | CRL-1997 | HPAF-II | Male | 44 | Met (ascites)
Pancreatic | HTB-134 | Hs 766T | Male | 46 | Met (lymph node)
Pancreatic | | IMIM-PC-1 | | | Primary
Pancreatic | | IMIM-PC-2 | | | Primary
Pancreatic | | KP2 | | |
Pancreatic | | KP-3 (JCRB0178.0) | | | Met (liver)
Pancreatic | | KP-4 (JCRB0182) | Male | 50 | Met (ascites)
Pancreatic | CRL-1420 | MIA PaCa-2 | Male | 65 | Primary
Pancreatic | CRL-2553 | Panc 02.03 | Female | 70 | Primary
Pancreatic | CRL-2555 | Panc 04.03 | Male | 70 | Primary
Pancreatic | CRL-2551 | Panc 08.13 | Male | 85 | Primary
Pancreatic | CRL-2547 | Panc 10.05 | Male | Unknown | Primary (same patient as PL45)
Pancreatic | CRL-1469 | PANC-1 | | | Primary
Pancreatic | | PATU8988S | | | Met (liver)
Pancreatic | | PATU8988T | | | Met (liver)
Pancreatic | CRL-2558 | PL45 | Male | Unknown | Primary (same patient as Panc 10.05)
Pancreatic | | RWP1 | | | Met (liver)
Pancreatic | | SK-PC-1 | | | Primary tumor
Pancreatic | | SK-PC-3 | | |
Pancreatic | CRL-1837 | SU.86.86 | Female | 57 | Met (liver)
Pancreatic | CRL-2172 | SW 1990 | | | Met (spleen)

Table A4. The RNAi Consortium (TRC) shRNA Constructs

Input | Clone ID | Clone Name | Target Taxon | Target Gene | Target Gene Symbol | %KD: mRNA expression remaining
ECT2-1 | TRCN0000047683 | NM_018098.4-2538s1c1 | Human | 1894 | ECT2 | 31





Table A5. Puromycin concentrations used in shRNA experiments

Cell Line | Puromycin Concentration (µg/mL)
AsPc1 | 2
Capan-1 | 2
Capan-2 | 3
KP4 | 2
HPAF-II | 2
Panc03.27 | 2
Panc04.03 | 2.5
Panc08.13 | 2
MIA PaCa-2 | 2
PATU8988S | 3.5

Table A6. Details of PLK1 compounds utilized in pharmacologic assay.

Compound | BI-6727 (Volasertib) | GSK461364
Company | Boehringer Ingelheim | GlaxoSmithKline
Mechanism | Selective and ATP-competitive inhibitor of PLK proteins | Selective and ATP-competitive inhibitor of PLK1
Highest Dev Status | Phase 2 Clinical | Current clinical trial
PLK1 IC50 | 0.87nM | 2.2nM
Selectivity | PLK2 (5nM), PLK3 (56nM) | 400-fold greater potency for PLK1 than PLK2/3

Chemical Structure
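
To put the two Selectivity entries of Table A6 on a common scale, the BI-6727 values can be converted to fold-selectivity relative to PLK1. The sketch below assumes that the figures in parentheses for PLK2 and PLK3 are IC50 values in nM, which the table does not state explicitly; the function name is hypothetical.

```python
def fold_selectivity(ic50_off_target_nm, ic50_target_nm):
    """Ratio of off-target to on-target IC50; larger means more selective."""
    return ic50_off_target_nm / ic50_target_nm


# Values taken from Table A6 (assumed to be IC50 values in nM).
BI6727_IC50_NM = {"PLK1": 0.87, "PLK2": 5.0, "PLK3": 56.0}

for kinase in ("PLK2", "PLK3"):
    ratio = fold_selectivity(BI6727_IC50_NM[kinase], BI6727_IC50_NM["PLK1"])
    print(f"BI-6727 {kinase}/PLK1: {ratio:.1f}-fold")
```

Under that assumption, BI-6727 is roughly 6-fold and 64-fold more potent against PLK1 than against PLK2 and PLK3 respectively, compared with the 400-fold figure listed for GSK461364.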
