
PROTEOMIC ANALYSIS OF A DYNAMIC SALMONELLA-

HOST INTERACTOME

by

Lyda Mimi Brown

B.Sc., California State University, Chico, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF

THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES

(Genome Science and Technology)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

December 2013

© Lyda Mimi Brown, 2013


Abstract

Salmonella is capable of evading the host immune response by secreting

virulence factors (effectors) that enter the host and interfere with critical cell

signaling networks. Although many of the individual effector proteins have been well

characterized by traditional biochemical methods, a shift towards global strategies

would offer a systems-level view of the intricate network of interactions (interactome)

between the host and pathogen.

Protein profiling across size exclusion chromatography with stable isotope

labeled amino acids in cell culture (PCP-SILAC) is a recent proteomic method

developed to characterize the composition of protein complexes with the advantages

of being quantitative, capable of monitoring dynamic changes, and being completely

tag and chemical cross-link free. This method was applied to study the dynamic

changes of host protein complexes as a result of Salmonella infection.

Analysis of dynamic PCP-SILAC proteins led to the hypothesis that host

translational machinery was being targeted by Salmonella during infection. To test

this hypothesis, a fluorescent assay was used to measure protein synthesis of cells

infected with WT Salmonella versus control (non-infected cells). Results from the

protein synthesis analysis showed a decrease in host translation, supporting our

hypothesis that Salmonella targets host translational machinery. In addition to this

novel finding, this work provides a rich resource of candidate host proteins that may

be involved with Salmonella’s pathogenesis.


Preface

Research experiments involving Salmonella have been approved by Biosafety

Certificate # B12-0088


Table of Contents

Abstract

Preface

Table of Contents

List of Tables

List of Figures

List of Abbreviations

Acknowledgements

Dedication

Chapter 1: Introduction

1.1 Mapping the interactome

1.1.1 Traditional mapping tools

1.2 Salmonella enterica

1.2.1 Classification

1.2.2 Human health

1.2.3 Pathogenesis

1.2.4 Proteomics and the host pathogen frontier

1.3 Quantitative mass spectrometry based proteomics

1.3.1 The proteome

1.3.2 Instrumentation

1.3.3 Protein identification

1.3.4 Quantitative techniques

1.4 Proteomic techniques for analysis of protein complexes

1.4.1 Tandem affinity purification (TAP)

1.4.2 High-throughput protein complex profiling

1.4.3 Chemical cross-linking in living systems

1.4.4 Validation

Chapter 2: Materials and methods

2.1 General solutions and buffers

2.2 Cell culture

2.2.1 SILAC labeling

2.3 Sample preparation

2.3.1 Salmonella infection

2.3.2 Cell harvesting

2.3.3 Size exclusion chromatography of protein complexes

2.3.4 In solution digestion of protein complexes

2.3.5 StageTip purification

2.4 Mass spectrometric methods

2.4.1 LC-MS/MS

2.5 Bioinformatics and statistical analysis

2.5.1 Database searching and quantitation

2.5.2 Visualizing protein chromatograms with R

2.5.3 Filtering protein chromatograms

2.5.4 Clustering protein chromatograms

2.5.5 Protein complex enrichment analysis

2.5.6 Differential protein changes

2.5.7 GO analysis of changing proteins

2.6 S. Typhimurium strains

2.6.1 RFP gene transfection of S. Typhimurium

2.6.2 Salmonella protein synthesis assay

Chapter 3: Results

3.1 Mapping a global interactome landscape with PCP-SILAC

3.1.1 Applying PCP-SILAC to a host-pathogen system

3.2 Identification of host proteins targeted during Salmonella invasion

3.2.1 Preprocessing of Replicate Data

3.2.2 Identifying dynamic proteins (log fold change, z-test, t-test)

3.2.3 Integrating a phosphoproteomic Salmonella study

3.2.4 GO enrichment analysis

3.3 Host protein synthesis machinery targeted during Salmonella invasion

Chapter 4: Discussion

4.1 Current Salmonella-host interactome

4.2 The identified dynamic host proteins during Salmonella invasion

4.2.1 PTRF/Cavin-1

4.2.2 HSP27

4.2.3 Cyclase Associated Protein-1

4.2.4 C-t-PAK2

4.3 Host translation during Salmonella invasion

4.3.1 Protein synthesis during Salmonella invasion

4.3.2 HSP27 inhibits translation

Chapter 5: Conclusion

Bibliography

Appendices

Appendix A List of Protein Complexes Enriched in PCP-SILAC dataset

Appendix B List of Dynamic Proteins in PCP-SILAC Salmonella Dataset Identified by Z-test


List of Tables

Table 2.1 Salmonella strains used in this thesis

Table 3.1 Identification of dynamic proteins after Salmonella infection (Log FC)

Table 3.2 Identification of dynamic proteins after Salmonella infection (Z-Test)

Table 3.3 Identification of dynamic proteins after Salmonella infection (T-Test)

Table 3.4 Protein Synthesis of Salmonella strains tested

Table 5.1 Protein Complex Enrichment Analysis Results

Table 5.2 Identification of dynamic proteins after Salmonella infection (Z-Test)


List of Figures

Figure 1.1 Modular organization in origami and protein interactions

Figure 1.2 Salmonella invasion

Figure 1.3 MS-based shotgun proteomics

Figure 1.4 Mass spectrometry approaches for analyzing protein complexes

Figure 3.1 Experimental workflow for PCP SILAC with Salmonella infection

Figure 3.2 PCP-SILAC data analysis workflow

Figure 3.3 PCP-SILAC landscape interactome

Figure 3.4 Individual protein profiles

Figure 3.5 Filtering protein profiles

Figure 3.6 Heatmap of hierarchical clustered protein profiles of one replicate

Figure 3.7 Aligning protein chromatograms with proteasome subunits

Figure 3.8 Heatmap of changing proteins identified by the z-test

Figure 3.9 HSP27 PCP-SILAC profile

Figure 3.10 Phosphoproteomic and PCP-SILAC venn diagram

Figure 3.11 GO functional analysis of dynamic PCP-SILAC proteins with infection

Figure 3.12 AHA protein synthesis assay

Figure 3.13 Protein synthesis in HeLa cells infected with WT Salmonella and effector knockout strains


List of Abbreviations

2DE Two-dimensional gel electrophoresis

ABC Ammonium bicarbonate

ABP Actin binding protein

AHA L-azidohomoalanine

ARAF Serine/threonine-protein kinase A-Raf

ARF ADP-ribosylation factor

CAP1 Adenylyl cyclase associated protein

CDC42 Cell division control protein 42 homolog

CID Collision-induced dissociation

Da Dalton

DNA Deoxyribonucleic Acid

DMEM Dulbecco’s Modified Eagle Medium

dFBS Dialyzed fetal bovine serum

emPAI Exponentially modified Protein Abundance Index

ESI Electrospray ionization

FPR False positive rates

g Gravity

GAP GTPase-activating protein

GEF Guanine nucleotide exchange factor

GO Gene Ontology


HCD Higher-energy collisional dissociation

H/L Heavy/Light Ratio of proteins

HSP Heat shock protein

IL Interleukin

IPI International Protein Index

kDa Kilodalton

LB Luria’s Broth

LC-MS Liquid chromatography mass spectrometry

LTQ Linear trap quadrupole

MALDI Matrix-assisted laser desorption/ionization

MAPK Mitogen-activated protein kinase

MAPKK MAPK kinase

M cell Microfold cell

mRNA Messenger ribonucleic acid

M/L Medium to light ratio of proteins

MOI Multiplicity of infection

M/Z Mass-to-charge ratio

PAI Protein abundance indices

PAK P21-activated kinases

PAMP Pathogen associated molecular patterns

PBS Phosphate buffered saline

PCP Protein correlation profiling

PI Phosphatidylinositol


PMN Polymorphonuclear

PPI Protein-protein interaction

PTRF/Cavin1 Polymerase I and transcript release factor

RPM Revolutions per minute

RSD Relative standard deviation

RT Room temperature

SCV Salmonella containing vacuole

SDS Sodium dodecyl sulfate

SEC Size exclusion chromatography

SIF Salmonella-induced filament

SILAC Stable isotope labeling by amino acids in cell culture

Sip Salmonella invasion protein

Sop Salmonella outer protein

SPI-1 Salmonella pathogenicity island 1

SPI-2 Salmonella pathogenicity island 2

StageTip Stop-and-go extraction tip

T3SS Type III secretion system

TAP Tandem affinity purification

TLR Toll-like receptors

WT Wild type

XIC Extracted ion current


Acknowledgements

Domo arigatou (thank you) to my supervisor Dr. Leonard Foster for his superb

guidance and for providing me with a challenging project that allowed me to develop skills in

multiple disciplines (biochemistry, cell biology, microbiology, and bioinformatics).

Thank you to all members of the Foster lab, especially Dr. Anders Kristensen and Dr.

Nat Brown for mentoring me.

I would like to thank my committee members, Dr. Paul Pavlidis and Dr. Jim

Kronstad for their helpful discussions and comments. I’m grateful for the generous

funding provided by the GSAT program.

Finally, I’d like to thank the SJC community, my family and friends for all of

their encouragement and support.


Dedication

To my parents and Mrs. Sugaya-Jones


Chapter 1: Introduction

This thesis presents a global analysis of host-pathogen protein interactions by

quantitatively monitoring cytosolic protein complex dynamics in response to

Salmonella infection. The research is interdisciplinary, bringing together

biochemistry, cell biology, and bioinformatic approaches to address a microbiology

question. The introduction begins with a brief description of the protein interactome,

illustrating the ‘big picture’ of this thesis. The section covering basic Salmonella

biology provides the motivation for this research as well as necessary background for

the host-pathogen system to be studied. The remainder of the introduction describes

quantitative mass spectrometry and compares the PCP-SILAC method to other

proteomic techniques for studying protein complexes.

1.1 Mapping the interactome

Figure 1.1 Modular organization in origami and protein interactions

A. Origami subunit. B. Origami cube. C. Origami octahedron. D. Origami stellated icosahedron. E.

Origami inverted icosahedron. F. Crystal structure of HSP27 (PDB: 3q9p) within a protein-protein

interaction network map generated from the STRING database v. 9.02.82

Proteins inside a cell interact with each other forming modular protein

complexes that carry out complex biological functions.1,2 The relationship between

modular organization and complexity is illustrated above in Figure 1.1. A simple

origami subunit, when assembled with other similar subunits, can form progressively

larger and more intricate shapes. The Sonobe origami models from top to

bottom include a cube (B, 6 subunits), stellated octahedron (C, 12 subunits), and

stellated icosahedrons (D and E, 30 subunits each). Models D and E are both made

from the same number of subunits; however, one fold was reversed in the subunits,

demonstrating the impact a small modification of a subunit can have on the overall

structure of the model. This point will come up again when discussing the caveats of

using protein tags to purify protein complexes.

1.1.1 Traditional mapping tools

Over the past four decades, charting of protein-protein interactions by

traditional reductionist techniques has been a slow and laborious process. The yeast

two-hybrid (Y2H) system was the first high-throughput approach for protein-protein

interaction studies. In this technique, a yeast transcription factor is split into two

segments and fused to two proteins (bait and prey). A protein interaction between

the bait and prey proteins reconstitutes the functionality of the transcription factor

by bringing them in close proximity, and produces a product that can be selected for.3

Characterizing these complexes by mass spectrometry (MS) based proteomics,

which is a high-throughput method for studying cellular protein complements,

provides the most comprehensive maps of protein-protein interactions. In 2006, two

groups concurrently published the first genome-wide characterizations of protein

complexes in Saccharomyces cerevisiae, with an identification of more than 30,000

proteins organized into around 500 complexes at an estimated 70% coverage. 4,5

Databases have been set up to store data on protein-protein interactions, with the

goal of providing users the most comprehensive interactome maps. An example of a

protein-protein interaction map is displayed in the right panel of Figure 1.1 with HSP27

as the query protein, shown in the middle with its protein crystal structure. The

colored nodes represent other proteins HSP27 interacts with, and the type of evidence

for that interaction is denoted by the color of the lines.
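The interaction map described above can be thought of as a graph whose edges carry one or more evidence types. The following is a minimal sketch of such a structure, not the STRING data model; the partner names and evidence labels are hypothetical placeholders and do not come from this thesis.

```python
from collections import defaultdict

# Hypothetical edge list: (protein_a, protein_b, evidence_type).
# Partner names and evidence labels are placeholders, not real interaction data.
EDGES = [
    ("HSP27", "PARTNER_A", "experimental"),
    ("HSP27", "PARTNER_B", "text-mining"),
    ("HSP27", "PARTNER_A", "co-expression"),
    ("PARTNER_A", "PARTNER_B", "experimental"),
]


def build_network(edges):
    """Return {protein: {neighbour: set of evidence types}} for an undirected map."""
    network = defaultdict(lambda: defaultdict(set))
    for a, b, evidence in edges:
        network[a][b].add(evidence)
        network[b][a].add(evidence)
    return network


def neighbours(network, query):
    """List the interaction partners of a query protein with their evidence types."""
    return {partner: sorted(ev) for partner, ev in network[query].items()}


network = build_network(EDGES)
hsp27_partners = neighbours(network, "HSP27")
```

Here `hsp27_partners` plays the role of the coloured nodes around the query protein, and each sorted evidence list corresponds to the coloured lines joining a pair of nodes.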


1.2 Salmonella enterica

1.2.1 Classification

In the mid 19th century, waves of hog cholera outbreaks were sweeping through

the southern states of America, devastating the swine industry. A veterinary

pathologist by the name of Dr. Salmon isolated a strain of bacteria, Salmonella

choleraesuis, from the intestine of an infected pig.6 Salmonellae are Gram-negative,

rod-shaped bacilli ranging in size from 2-5 x 0.7-1.5 µm. They are motile with flagella

distributed around their entire cell body. The optimal growth temperature is between 35-37

ºC. As facultative anaerobes they are metabolically capable of thriving in a variety of

environments.7

Phylogenetically, Salmonella belongs to the Enterobacteriaceae family of

bacteria along with Escherichia coli and Yersinia pestis. From genomic comparisons,

it has been estimated that Salmonella and E. coli diverged from a common ancestor

around 100 million years ago.8 The Salmonella genus is split into two species:

Salmonella enterica and Salmonella bongori. Salmonella enterica is further divided

into six subspecies (I, II, IIIa, IIIb, IV, VI). Salmonella enterica subspecies enterica (I)

invades warm-blooded animals while the other subspecies primarily invade cold-

blooded animals (with rare occurrences of human infection). Based on Salmonella’s

diverse outer structural antigens, the subspecies have been serotyped into over 2,500

serovars. Many serovars have adapted to a particular host, for example Salmonella

Typhi only infects humans and Salmonella Dublin is restricted to cattle. In contrast,

Salmonella Typhimurium has a broad host range.9


1.2.2 Human health

Infections by the foodborne pathogen Salmonella (salmonellosis) cause a

spectrum of clinical diseases. Enteric fever (typhoid fever) is a systemic, long lasting

infection characterized by abdominal pain, fever, rashes and diarrhea after ingestion

of S. enterica serovars Typhi or Paratyphi. Cases occur most frequently in regions of

poor sanitation and have a mortality rate of 10-15% if left untreated.9,10 Bacteremia

(infection of the bloodstream) is another life threatening disease that accounts for

about 8% of untreated salmonellosis cases.7 Gastroenteritis is a self-limited form of

salmonellosis caused by nontyphoidal serotypes such as S. enterica serovar Enteritidis

and S. enterica serovar Typhimurium. Symptoms of gastroenteritis occur 6-48 hours

after ingestion of contaminated food or water and include abdominal pain, vomiting,

and diarrhea.9,11 Antibiotics are a common treatment for salmonellosis, which has

led to a rise of multidrug resistant strains.

Salmonella is transmitted through ingestion of contaminated foods (meats,

eggs, produce), contaminated water, and contact with animal carriers of Salmonella

(humans can also be chronic carriers, e.g., Typhoid Mary). A food inspection study of

ground meat reported that 26% of chicken, 18% of turkey, and 3% of beef tested

positive for Salmonella.9 The current global estimates of salmonellosis are 16 million

enteric fever and 1.3 billion gastroenteritis cases per year with 3 million deaths.7 In

the United States, the incidence rate of nontyphoidal salmonellosis has doubled in the

last two decades.9 New drug targets for combating salmonellosis are in high demand

globally.

1.2.3 Pathogenesis

On the outer surface, Salmonella expresses several fimbrial adhesins that help

deliver the bacterium to the intestinal epithelium.10 Microfold cells (M cells) are

interspersed with intestinal epithelial cells at lymphoid follicles above Peyer’s

patches. They have an increased rate of pinocytosis, which allows Peyer’s patches to

sample foreign antigens from the lumen and prime developing lymphoblasts.

Salmonella exploits this immune surveillance function of the gut by targeting M cells

for infection.12

Invasion begins when Salmonella injects bacterial virulence factors (effectors) into the

cell. The effectors hijack the host cell machinery, rearrange host cell architecture,

and induce membrane ruffling at the point of contact. Salmonella gains entry into

the host cell within a membrane vacuole, the Salmonella-containing vacuole (SCV).

During later stages of infection, Salmonella can replicate within the SCV and cause

inflammation leading to further dissemination within the host (see Figure 1.2).

Figure 1.2 Salmonella invasion

To invade epithelial cells, Salmonella must first make contact with the outer membrane through

reversible adhesion with membrane receptors. Salmonella uses a type three secretion system

(yellow line) to inject effector proteins into the host cell that interfere with host signaling

pathways. Host membrane ruffling allows Salmonella entry into the host cytosol within a

vacuole, where it can multiply hidden from host defense mechanisms.

Salmonella triggers the host inflammatory response as a strategy to

outcompete resident microbiota of the human gut. One highly reactive and toxic

compound produced in large quantities by the gut microbiota is hydrogen sulfide

(H₂S). The host quickly converts H₂S to a more stable compound, thiosulphate (S₂O₃²⁻).

Nitric oxide radicals and reactive oxygen species released during inflammation

oxidize thiosulphate into tetrathionate (S₄O₆²⁻), a compound which Salmonella has

the unique ability to reduce as an energy source.13,14

Virulence genes, acquired by horizontal gene transfer, are clustered into two

locations on the bacterial chromosome that are referred to as Salmonella

Pathogenicity Islands 1 and 2 (SPI-1 and SPI-2). SPI-1 encodes effectors for invasion of

the host cell, and SPI-2 encodes effectors for replication within the SCV.10,15 Salmonella injects

effectors into the host by a needle complex, referred to as a type three secretion

system (T3SS). Effectors flagged for secretion contain a signal near their N-terminus,

which can bind to chaperone proteins to aid in their delivery.8

Downstream targets have been traced for a handful of Salmonella effectors. A

burst of SipA effectors translocated into the host (10^3 SipA/bacterium) helps

rearrange host architecture by binding to actin filaments.16 SipA also attracts

polymorphonuclear cells (PMNs) to sites of infection by activating the PKC alpha signalling

pathway to turn on expression of the chemokine IL-8.15

Membrane ruffling of the host cell is a reversible process mediated by

monomeric G proteins involved with cell morphology and motility. These Rho family G

proteins (Cdc42, Rac1, Rho) are hijacked temporally by the effectors SopE, a guanine

nucleotide exchange factor; SptP, a GTPase activating protein; and SopB, an inositol

polyphosphate phosphatase. Once inside the host cell, SopE has a shorter half life

than the others, which then allows SptP an open window to reverse the actions of

SopB and SopE.15,17

1.2.4 Proteomics and the host pathogen frontier

Developments in the proteomic field have expanded the scope of host-

pathogen interaction studies, providing an unbiased global view of the proteome

under different conditions. A mass spectrometry analysis of the Salmonella

secretome identified six new T3SS effectors, and estimated Salmonella Typhimurium

to have as many as 300 T3SS effectors.15 The first steps towards elucidating

proteome-wide phosphorylation and ubiquitination signaling pathways of host-pathogen

systems have been taken using purification enrichment methods.8,18 Assembling a

coherent story from different types of proteomic studies is challenging, and will

require a systems biology model with additional layers of omic data. The

development of host-pathogen interaction models will be key to designing novel

antimicrobial therapies that target a specific pathogen and do not disrupt the host’s

microflora.10

1.3 Quantitative mass spectrometry based proteomics

1.3.1 The proteome

The genomics revolution laid the foundation for a number of other ‘omic fields,

including proteomics (the study of the proteome). Before the completion of the

human genome project, the proteomes of simple model organisms such as

Mycoplasma genitalium were mapped. In these early studies, the proteome was

referred to as, “The total protein complement able to be encoded by a given

genome”.19 Technological advances allowed scientists to dig deeper into the vast

proteome of a number of model organisms, cataloguing proteins in a high-throughput

mode. The original definition of a proteome quickly became outdated as it does not

reflect the dynamic and complex behaviour of proteins in a living organism. Today the

proteome is defined as the protein complement of a specified cell type in reference

to time and includes splice isoforms as well as any modifications.20

1.3.2 Instrumentation

Proteomes were traditionally characterized by protein microarrays and protein

staining with two-dimensional gel electrophoresis (2DE); the latter is a technique that

resolves a complex protein sample in two dimensions based on each protein’s size and

isoelectric point. These technologies, especially 2DE, suffered from poor

reproducibility and were limited to characterizing the most abundant proteins. In the

late 1980s, breakthrough technological advances in adapting mass spectrometry to

biomolecules overcame the limitations of competing technologies and pushed mass

spectrometry to the forefront of the proteomic field.

A mass spectrometer is a versatile instrument that generates electrical and

magnetic fields to control movements of gas phase ions under vacuum. The acceleration of

an ion in an electric field is directly proportional to the ion’s charge and

inversely proportional to its mass. The three major components of a mass

spectrometer are an ionization source, a mass analyzer, and a detector. Ions

generated by the ionization source are separated by the mass analyzer

according to their mass-to-charge (M/Z) ratios. Selected ions are then

passed towards the detector where they are registered. A mass spectrum is a plot of

ion intensities versus M/Z (measured in thomsons), which can be used to calculate

the mass of a molecule.21
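The relationship between a measured M/Z value and the mass of a molecule can be illustrated with a short calculation. This is a minimal sketch assuming positive-mode ions formed by protonation, so that an ion of charge z carries z extra protons; the 1500 Da peptide mass is hypothetical, and the 1.00335 Da isotope spacing is the commonly used 13C-12C mass difference rather than a value taken from this thesis.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton
ISOTOPE_SPACING = 1.00335  # Da, approximate 13C - 12C mass difference


def mz_from_mass(neutral_mass, charge):
    """M/Z of an [M + zH]z+ ion for a given neutral monoisotopic mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from a measured M/Z and a known charge state."""
    return charge * (mz - PROTON_MASS)


def charge_from_isotope_spacing(spacing):
    """Estimate charge from the M/Z gap between adjacent isotope peaks."""
    return round(ISOTOPE_SPACING / spacing)


# Hypothetical peptide of 1500.0 Da observed as a doubly charged ion.
mz_2plus = mz_from_mass(1500.0, 2)
recovered_mass = mass_from_mz(mz_2plus, 2)
```

The same peptide appears at a different M/Z for each charge state, which is why the charge (often inferred from the isotope peak spacing) must be known before a mass can be assigned.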

1.3.3 Protein identification

Whole genome sequencing projects unlocked the codes necessary for proteome

identification by mass spectrometry as theoretical masses could be calculated and

stored in a searchable database. Proteins in their native state are challenging to

analyze by mass spectrometry because of their size and polarity. Site-specific

proteases can reduce these barriers by breaking proteins down into peptides.

Proteomic analysis by mass spectrometry began in the 1980s with the development of

two novel soft-ionization techniques. Matrix-assisted laser desorption ionization

(MALDI) sublimates and ionizes peptides from a solid crystalline state. The second

and more widely used technique is electrospray ionization (ESI), which vaporizes and

ionizes peptides into multiply charged ions (typically cations since the initial solution

is acidic). ESI is well suited for high-throughput analyses of proteomes since it can be

coupled to upstream peptide separation by nanoflow liquid chromatography.22

Hybrid mass spectrometers designed for proteomics contain multiple mass

analyzers to extract additional information from peptide ions. The first mass analyzer

has the highest resolution and selects a specific M/Z (parent ion) from the mixture of

peptides eluting from the liquid chromatography column at that time. The parent ion

is accelerated into a chamber filled with inert gas, causing the peptide to fragment

along the peptide backbone, a process termed collision-induced dissociation (CID).

Fragment ions are measured (tandem MS or MS/MS) and provide sequence information

to help identify the peptide. In the downstream data analysis, protein identifications

are made by algorithms that search the protein sequence database for peptides that

match the experimental MS and MS/MS spectra. An overview of the proteomic

workflow is illustrated in Figure 1.3.

Figure 1.3 MS-based shotgun proteomics

Cells are harvested and complex protein mixtures are digested into peptides by proteases. Peptides are

separated by reversed-phase chromatography, ionized, and analyzed inside a mass spectrometer.

Peptides to be sequenced are isolated, fragmented, and analyzed by a parallel mass spectrometer. Raw

MS data are processed by software for identification and quantitation of proteins.
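The database search idea described in this section can be sketched in a few lines: digest each database sequence in silico, compute theoretical peptide masses, and compare them with an observed mass. This is a simplified illustration, not the search algorithm used in this thesis. It assumes trypsin-like cleavage (after K or R, but not before P), matches on precursor mass only, and ignores MS/MS fragment scoring, missed cleavages, and modifications. The toy sequence and the 10 ppm tolerance are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # Da, added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def match_peptides(observed_mass, database, tolerance_ppm=10.0):
    """Return (protein, peptide) pairs whose theoretical mass is within tolerance."""
    hits = []
    for protein, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            theoretical = peptide_mass(peptide)
            if abs(observed_mass - theoretical) / theoretical * 1e6 <= tolerance_ppm:
                hits.append((protein, peptide))
    return hits


# Hypothetical one-protein database for illustration only.
toy_database = {"toy_protein": "AKRPGKMW"}
toy_hits = match_peptides(217.14263, toy_database)
```

In practice a precursor mass alone is rarely unique, which is why the fragment spectra described above are needed to confirm the peptide sequence.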


1.3.4 Quantitative techniques

It is often necessary to add a quantitative dimension to a proteomic analysis.

Biochemical properties such as length, side chains, post-translational modifications

(PTM) and charge state can all contribute to differences in ionization efficiency of a

peptide.23 Thus, due to the stochastic nature of peptide ionization, signal intensity is

not always an accurate measure of protein abundance.

Label-free methods have a high dynamic range, are global, and offer

quantitation accuracies comparable to protein staining (>30% relative standard

deviation).22,23 The two most common label-free methods for calculating protein

abundance are integration of peptide precursor peak signals and counting of peptide

fragment spectra. In the first approach high resolution MS and reproducible

chromatography are critically important for the analysis. Integration of precursor

peaks generates an extracted ion current (XIC) for a peptide, which can then be

compared to other XICs of the same peptide in different samples. Protein abundance

roughly correlates with the number of MS/MS fragment spectra generated. This led to

the development of a protein abundance index (PAI), which is the count of observed

spectra divided by the number of theoretically observable peptides for a protein. A

few variants exist, such as the exponentially modified PAI (emPAI), which is 10

raised to the power of the PAI, minus one.24
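The two indices can be illustrated with a minimal Python sketch. It assumes the commonly cited form emPAI = 10^PAI - 1; the function names and the example counts are hypothetical and are not taken from this work.

```python
def pai(n_observed: int, n_observable: int) -> float:
    """Protein abundance index: observed count over theoretically observable peptides."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return n_observed / n_observable


def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified PAI, assuming the form 10**PAI - 1."""
    return 10 ** pai(n_observed, n_observable) - 1


# Hypothetical protein: 6 observations against 12 theoretically observable peptides.
print(pai(6, 12))
print(round(empai(6, 12), 3))
```

Because the exponent grows quickly, emPAI spreads out proteins with similar PAI values more than the raw index does.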

The use of stable isotopes (e.g., 1H vs. 2H, 12C vs. 13C, 15N, or 18O) to

differentially label samples allows a direct measure of the relative abundances of

proteins/peptides between the light (naturally occurring) and heavy (isotopically

enriched) forms. In the 1970s, labeling methods based upon the concept of stable

isotope dilution were applied in clinical chemistry and pharmacokinetics.25 Today,

stable isotope labels are widely used in proteomics and a number of techniques have

been developed to incorporate the labels into the proteomic workflow.

Proteomic samples can be labeled at the protein level or downstream at the

peptide level through metabolic, enzymatic, or chemical reactions. Quantitative

accuracies are generally higher when samples are labeled at the protein versus

peptide level since mixing of the two samples occurs further upstream in the

proteomic workflow and can reduce experimental error. Stable isotope labeling by

amino acids in cell culture (SILAC) incorporates heavy amino acids into proteins at the

time they are synthesized by the ribosome. Cell culture media deficient in essential

amino acids are supplemented with isotopic analogues (most commonly arginine and

lysine are chosen since together they are found in virtually all tryptic peptides). After

five to ten generations, labeled proteins in a cell population have nearly complete

incorporation of the heavy or medium amino acids and show a characteristic Lys4/Arg6

(medium) or Lys8/Arg10 (heavy) mass shift in the MS spectra. Relative comparisons can be made

about the proteome by identifying and quantifying isotope pairs.26 Absolute

quantification can be reached by spiking in labeled internal standards, which is

typically reserved for targeted studies due to technical difficulties in generating such

standards in high-throughput.
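The expected spacing between SILAC partners in a spectrum follows from the label masses and the peptide charge. The Python sketch below is an illustration only: the mass differences are approximate monoisotopic values assumed for the Lys4/Arg6 and Lys8/Arg10 labels, and the function name and example peptides are hypothetical.

```python
# Approximate monoisotopic mass differences (Da) relative to the light amino acid.
# These values are assumptions for illustration; verify them against a reference table.
LABEL_SHIFTS = {
    "light": {"K": 0.0, "R": 0.0},
    "medium": {"K": 4.025107, "R": 6.020129},   # Lys4 (D4), Arg6 (13C6)
    "heavy": {"K": 8.014199, "R": 10.008269},   # Lys8 (13C6 15N2), Arg10 (13C6 15N4)
}


def silac_mz_shift(peptide: str, label: str, charge: int) -> float:
    """Return the m/z offset of a labeled peptide relative to its light form."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    shifts = LABEL_SHIFTS[label]
    # Sum the shift contributed by every lysine and arginine in the sequence.
    mass_shift = sum(shifts.get(residue, 0.0) for residue in peptide)
    return mass_shift / charge


# A doubly charged tryptic peptide ending in lysine.
print(silac_mz_shift("PEPTIDEK", "medium", 2))
print(silac_mz_shift("PEPTIDEK", "heavy", 2))
```

A peptide with one missed cleavage carries two labeled residues, so its partners sit correspondingly further apart.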

1.4 Proteomic techniques for analysis of protein complexes

Two inherent properties of signaling complexes that make them challenging to

work with are their transient nature and low abundance. These bottlenecks have

driven the development of a variety of novel proteomic techniques, which can broadly

be classified as being either tag or chemical cross-link based. These developments

will be framed in the context of their strengths and weaknesses.

In a standard proteomic workflow, one has to enrich the proteins of

interest. For signaling complexes, the goal is to purify the target structure with a

minimal amount of contamination while maintaining the interactions of the

native binding partners.

Finding this balance is how the two main classes of strategies are divided in

this paper, as depicted in Figure 1.4. One class utilizes epitope tags as handles to

enrich complexes and thereby aid in identifying the components. When such affinity

purification is coupled with quantitative proteomic techniques, the stoichiometry of

the protein complex subunits can be accurately determined. The second class uses

chemical cross-linking with mass spectrometry, which adds a spatial organization of

the signaling complexes. By combining both classes of strategies, signaling complexes

can be reconstructed and their functions predicted.

Figure 1.4 Mass spectrometry approaches for analyzing protein complexes

Affinity purification of tagged proteins and chemical cross-linking to stabilize complexes are two

common approaches for studying protein complexes; their advantages are highlighted in dark gray.

1.4.1 Tandem affinity purification (TAP)

Epitope-tagging is a widely used method for isolation of protein complexes out

of cell extracts. In this approach, a bait protein is genetically fused with an epitope

tag expressed in the host cell. The interacting partners of a complex can

subsequently be captured by binding of the epitope tag to an affinity matrix. For

example, a protein fused to Protein A, which is an epitope derived from

Staphylococcus aureus, can be purified with an anti-Protein A antibody.

Although one-step affinity purification can be an effective strategy for isolation

of multiprotein complexes, non-specific binding can be quite common. To achieve

higher purity, a dual purification strategy termed tandem affinity purification (TAP)

was developed by Rigaut et al.27 The original TAP-tag construct system included protein A

and a calmodulin binding peptide as tandem tags with a tobacco etch virus protease

cleavage site in between the tags. The tag cassette has the flexibility of being fused

to either the N- or C-terminal end of a target protein to obtain the optimal expression

of the fused protein in a host cell.

The first purification step of the fusion protein and its associated components

involves binding of Protein A to an IgG matrix. After gentle washing, purified

complexes are released by the tobacco etch virus enzyme cleavage while

nonspecifically bound proteins are left behind. In the second purification step, the

eluate of the first affinity step is incubated with calmodulin-coated beads in the

presence of calcium. Following another wash step, the fusion protein and its binding

partners are specifically released via calcium chelation. Again, proteins that interact

nonspecifically with the support matrix are left behind.28

A wave of new tandem affinity tag combinations is currently available

commercially. The choice of tag combination is heavily influenced by the

biochemistry of the model organism being studied. For example, higher eukaryotic

cells naturally express high levels of calmodulin and calmodulin-binding proteins

that can interfere with the binding of the calmodulin binding peptide tag to the

resins. For this reason, studies in mammals and plants generally use alternative tags such as the

streptavidin-binding peptide tag (GS-TAP tag). Another important consideration is the

size of the tag. Bulky tags are more likely to interfere with the biological function of

the tagged protein, such as protein folding and recruitment to protein complexes.

The tags taking advantage of the high biotin and streptavidin binding affinities are

much smaller compared to the original TAP tag.29

Epitope tags and their cognate antibodies have an advantage over traditional

immunoprecipitations using antibodies against the protein of interest itself in that

they avoid the need for specific antibodies for every target protein of interest. This

method has been widely used in both targeted and large-scale analysis of protein

complexes. Some disadvantages include the time it takes, issues with expressing the

tags at physiological concentrations, tagging artifacts mentioned earlier, and being

limited to complexes with high affinity.29 To address these drawbacks, Kristensen of

the Foster lab developed a new tag free method for protein complex analysis that is

faster and adds a quantitative dimension.

1.4.2 High-throughput protein complex profiling

Protein complexes are now being analyzed on a global scale by using tag-free

approaches that separate complex mixtures of endogenous protein complexes into a

set of fractions by gentle, non-denaturing techniques like size exclusion

chromatography (SEC), ion-exchange chromatography (IEX), and blue native

polyacrylamide gel electrophoresis (BN-PAGE). Fractions are analyzed by mass

spectrometry, and then characteristic profile plots of comigrating proteins are

clustered to reconstruct protein complexes.

Initial ‘tagless’ studies of protein complexes suffered from poor yields; for

example, a 2007 study of cytosolic E. coli protein complexes used a three-step

chromatography separation (IEX, HIC, and SEC) and identified 103 proteins and 13

protein complexes30. Last year Emili’s group published a study with extensive IEX and

sucrose fractionation (1,163 fractions) for a reported 13,993 cocomplex interactions

among 3,006 human proteins with an estimated 21.5% FDR31. These numbers reflect

improvements in chromatography and mass spectrometry over the past five years.

Blue native PAGE gels are able to resolve large labile membrane protein

complexes of mitochondria, up to 30 MDa. Heide et al. identified 464 mitochondrial

proteins and assigned new members to a previously characterized protein complex32.

Membrane complexes require careful sample handling, since solubilizing proteins from

the membrane can lead to disassembly of the protein complex. Digitonin has had the

most success in recent years for this application.

Using a technique called protein correlation profiling across SEC with SILAC

(PCP-SILAC), dynamic changes of cytosolic protein complexes in response to stimuli

can now be monitored. This method takes less than two percent of the time of

conventional AP-MS approaches for profiling protein complexes, thus opening the door

for testing a wide variety of stimuli33.

In the PCP-SILAC experimental workflow, triplex SILAC cells (light, medium,

and heavy) are grown, and heavy labeled cells are treated with a compound or

infection. Cell lysates are separated into 50 fractions by high-performance liquid

chromatography with an SEC column having an optimal resolution between 150 kDa and 2 MDa.

After fractionation, light samples are pooled together and added to each of the

medium/heavy fractions, serving as internal standards for quantification. Each

fraction mixture is then tryptic digested and analyzed by tandem mass spectrometry.

Hierarchical clustering of the chromatograms led to the identification of 291

complexes based on an empirically determined distance threshold. Kristensen et al.

were able to identify the chaperonin complex, a complex that had not been identified

with the TAG strategy. The resolution of these profiles allows the distinction

between assembled and non-assembled proteasome complexes.33 Stimulation by EGF

for 20 minutes led to an increase or decrease of association to complexes for 351

proteins.

The main advantage of using SILAC for high throughput protein complex

profiling is the ability to globally measure dynamic responses of the cell to stimuli.

Lamond’s group recently published a paper using a similar SEC approach for

separating protein complexes and identified nearly double the number of proteins

compared to PCP-SILAC; however, they used spectral counting for MS quantitation and

therefore analyzed a static protein interactome34. A limitation of SILAC is the

prerequisite of a compatible cell line that can be fully labeled by the isotope fed

media.

1.4.3 Chemical cross-linking in living systems

Up to this point, the focus of this section has been on isolating protein

complexes with biochemical techniques. The second class of techniques involves

chemical cross-linking strategies. The idea is to use a cross-linker to capture a

snapshot of the cell. The chemical bonds formed by the cross-linker help stabilize

the complex and allow more stringent washing conditions to lower background

binding.

There are a variety of cross-linkers available with different spacer lengths and

functional groups. Formaldehyde is an attractive option because it cross-links only

closely associated proteins, has a high permeability towards cell membranes, and is

cheap. Currently, the field is still at the very early stages of using formaldehyde for

protein-protein interaction studies. A study conducted by the Kast group addresses

some important issues with formaldehyde and provides optimized experimental

conditions for integrin beta 1 (B1). In a basic workflow, cells are treated with

formaldehyde, lysed and protein complexes are precipitated by antibodies. One

concern was that formaldehyde might disrupt the epitope tag and prevent

precipitation. They tested this hypothesis by precipitating Integrin B1 cross-linked

complexes that have epitopes with varying numbers of amino acids that can be

modified by formaldehyde. They concluded that under the cross-linking conditions

they tested, this was not a problem. They provided optimal experimental conditions,

such as formaldehyde concentration, length of incubation and temperature for

running the gels. They demonstrated with gels that formaldehyde complexes were

preserved if samples were only incubated at 65 °C, whereas most of the cross-links

were reversed at 99 °C.35

A novel cross-linker that has potential applications for protein complex studies

is a photocleavable protein interaction reporter that provides both identification of

interacting proteins and spatial details about the binding domain. The structure

contains two reactive groups next to photocleavable groups and a reporter with an

affinity tag. Directing UV light onto the cross-linked complex allows fragmentation

into two peptides and the reporter.

In a study demonstrating the proof of this concept, Zhang et al. cross-linked E.

coli proteins in vivo prior to lysing the cells and digesting the proteins into peptides.

An avidin-biotin affinity purification was used to enrich the cross-linked products.

Photocleavage was performed by a UV laser that was focused on the sample in a

capillary. Samples were then analyzed by tandem MS and identified with in-house

software. They identified 114 inter- and intra-protein interactions, 38 of which had been

previously reported in the E. coli interaction database.36

The attractive aspects of chemical cross-linking for the analysis of multiprotein

complexes are the spatial information about where proteins bind and an increased

purity of the sample. The trade-offs are more complex spectra, which require

sophisticated software for data analysis, and the time needed to work out the optimal

conditions, such as the concentration of cross-linker reagent.

1.4.4 Validation

Currently there is not a strict standardized method for validating protein

complexes. In these types of large-scale studies, individual interactions are typically

not quality controlled or validated. Therefore, the results almost certainly contain

false positive interactions arising from spurious interactions or false negatives arising

from missed interactions. Generally within a paper studying protein complexes, the

authors will make some reference as to how their identifications compare to the

known interactors from a literature-curated reference of interactions, such as the

CORUM database37.

Although a number of signaling complexes have been successfully identified

and studied by proteomic techniques, a comprehensive analysis of signaling

complexes is currently unattainable due to several bottlenecks. First, the proteins

comprising most signaling complexes are typically expressed near the lower end of

the abundance range, making them harder to identify. Typically, MS identification of

signaling complexes starts from 10^8 cells. For dynamic or low-abundance protein

complexes, scaling up would require hundreds of flasks of monolayer cells to be

cultured. Second, the interactions holding signaling complexes together can be weak

and thus may not always survive affinity purification. Cross-linking can help alleviate

this problem to a degree, but when functional groups of the proteins being cross-

linked are not positioned correctly in the interacting interface, a cross-link can't be

made. Third, their assembly and disassembly occurs rapidly. For example, T cell

receptors recruit different proteins within 15 s of receptor activation which can't be

captured by the current methods.38 To overcome these obstacles in the short term,

proteomic studies can complement their data with alternative technologies, such as

electron microscopy and cellular electron tomograms.39

Chapter 2: Materials and methods

2.1 General solutions and buffers

ABC buffer

50 mM ammonium bicarbonate (NH4HCO3, ABC) in water (pH 8.0). Stored at RT.

Reduction buffer

10 mM dithiothreitol (DTT) in 50 mM ABC buffer. Stored in small aliquots at -20 °C.

Alkylation buffer

55 mM iodoacetamide in 50 mM ABC buffer. Stored in small aliquots at -20 °C in the

dark.

Buffer A (starting mobile phase for LC-MS/MS)

0.5% acetic acid in water. Stored at RT.

Buffer B (ending mobile phase for LC-MS/MS)

0.5% acetic acid and 80% acetonitrile in water. Stored at RT.

2.2 Cell culture

HeLa cells (American Type Culture Collection) were cultured in a humidified

incubator at 37 °C in the presence of 5% CO2 with Dulbecco’s Modified Eagle’s Medium

(DMEM) (Caisson Laboratories Inc.) supplemented with 10% (v/v) dialyzed fetal bovine

serum (dFBS) (Invitrogen), 2 mM glutamine (Thermo Fisher Scientific), and 100 U/mL

penicillin/streptomycin antibiotics (Thermo Fisher Scientific).

2.2.1 SILAC labeling

For SILAC labeling of HeLa cells, DMEM media lacking arginine and lysine were

enriched by adding the following: (1) L-arginine (22 mg/L) and L-lysine (38 mg/L)

(Sigma-Aldrich, Oakville, ON) for “light” labeled cells, (2) 13C6 L-Arginine (22 mg/L)

and D4 L-Lysine (38 mg/L) (Cambridge Isotope Laboratories, Andover, MA) for

“medium” labeled cells, and (3) 13C6 15N4 L-Arginine (22 mg/L) and 13C6 15N2 L-Lysine

(46 mg/L) (Cambridge Isotope Laboratories, Andover, MA) for “heavy” labeled cells.

HeLa cells were split at a 1:4 dilution into the three SILAC media formulations and

passaged five times for complete incorporation of labeled amino acids into proteins.

Arginine and lysine are the most common amino acids for labeling cells in proteomic

experiments because the protease trypsin cleaves at the carboxy-termini of arginine

and lysine (thus producing ideal peptides for quantitation)26,40.
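The cleavage rule can be sketched as a simple in-silico digest. This Python example is illustrative only: it cleaves after every arginine and lysine with no missed cleavages and no proline exception, and the function name and test sequence are hypothetical.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after every K or R (no missed cleavages)."""
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue in "KR":
            # Trypsin cleaves on the carboxy-terminal side, so the K/R stays
            # at the end of the peptide that is being closed.
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        # The protein C-terminal peptide may not end in K or R.
        peptides.append(sequence[start:])
    return peptides


print(tryptic_digest("MAGTKLSPERDDW"))
```

Every internal peptide produced this way ends in exactly one labeled residue, which is why arginine and lysine labels give a predictable mass offset per peptide.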

2.3 Sample preparation

2.3.1 Salmonella infection

Prior to Salmonella infection, all cell cultures (five 15 cm dishes per SILAC

population) were serum starved for 20 h by washing cells with phosphate-buffered saline

(PBS) two times and plating cells in SILAC DMEM containing no antibiotics or fetal

bovine serum. An overnight culture of wild-type Salmonella enterica serovar

Typhimurium strain SL1344 was subcultured (1:33) for 3 h at 35 °C. The Salmonella

inoculum was prepared by pelleting the bacteria at 10,000 relative centrifugal force

(rcf) for 2 min at RT and resuspending cells in antibiotic-free DMEM at a multiplicity

of infection (MOI) of 200. Heavy labeled cells were incubated with the Salmonella

inoculum for 20 min at 37 °C.

2.3.2 Cell harvesting

After infection, cells were immediately placed on ice. Cells were washed three

times with cold PBS and harvested with a scraper. Harvested cells having the same

SILAC label were pooled, pelleted for 4 min at 550 rcf at 4 °C and resuspended in 2

mL of size exclusion chromatography (SEC) mobile phase (50 mM KCl, 50 mM

NaCH3COO, pH 7.2) containing complete protease inhibitor cocktail without EDTA

(Roche) and additional phosphatase inhibitors (5 mM Na4P2O7, 0.5 mM pervanadate).

Cells were lysed by 200 strokes with a Dounce homogenizer and concentrated with a

spin column (100 kDa MW cutoff, Sartorius Stedim).

2.3.3 Size exclusion chromatography of protein complexes

Cell lysates from the heavy labeled cells (Salmonella infected) were combined

with the medium labeled cells right before separation by size exclusion

chromatography. Samples were loaded onto a 600 x 7.8 mm BioSep4000 Column

(Phenomenex) and separated into 80 fractions by a 1200 Series semi-preparative HPLC

(Agilent Technologies, Santa Clara, CA) at a flow rate of 0.5 mL/min at 8°C. The

fractions from the light SILAC populations served as an internal standard and were

separated by SEC independently of the medium/heavy samples. The fractions to be

analyzed by MS (first 45 fractions) were pooled together and spiked into each of the

corresponding medium/heavy fractions at a volume of 1:1.

2.3.4 In solution digestion of protein complexes

Sodium deoxycholate was added to each fraction to a final concentration of

1.0% (v/v), and each sample was then boiled for 5 min. Protein samples were reduced for

30 min at RT in 10 mM dithiothreitol (DTT) solution followed by alkylation for 20 min

with 55 mM iodoacetamide (IAA) in the dark at RT. Sequence grade trypsin (Promega;

protein:enzyme concentration 50:1) was added to each sample and incubated

overnight at 37 °C. Peptides were acidified to pH < 3 with acetic acid and cholic acid

was pelleted by spinning at 16,000 rcf for 10 min.

2.3.5 StageTip purification

Stop-and-go-extraction tips (StageTips)41 were self-made by punching out two

small disks of C18 Empore material (3M) using a 22G syringe and packing them at the

end of a 200 µL pipette tip. The StageTips were conditioned with methanol and

equilibrated with Buffer A. The in-solution peptides were acidified to pH < 3 with

acetic acid. Peptides were loaded onto the column with Buffer A by centrifugation at a

maximum speed of less than 5,000 rpm. Peptides were washed once with Buffer B,

and eluted from the column with 30 µL of Buffer B directly into an HPLC autosampler

plate. Samples were concentrated in a vacuum concentrator and resuspended in 10 µL

Buffer A.

2.4 Mass spectrometric methods

2.4.1 LC-MS/MS

Peptides from each sample were separated by a 180 min gradient (5-35%

acetonitrile in 0.5% aqueous acetic acid) using an in-house packed C-18 analytical

column (200 mm length, 75 µm I.D.), packed with 3.0 µm-diameter ReproSil-Pur C-18-AQ

beads (Dr. Maisch, www.Dr-Maisch.com). Peptides were eluted from the column

and electrosprayed into a linear-trapping quadrupole – Orbitrap mass spectrometer

(LTQ-Orbitrap Velos; Thermo Fisher). The LTQ-Orbitrap was operated with the

following settings: one full precursor scan in the Orbitrap (resolution 60,000; 350-

1,600 Th). The top ten most intense peptide ions were selected for simultaneous

fragmentation by collision-induced dissociation and top five by HCD (resolution 7500)

in each cycle in the LTQ. The LTQ was operated with the following settings: minimum

signal intensity 1000 counts, singly charged ions were excluded, and parent ions were

excluded from MS/MS for the next 30 sec.

2.5 Bioinformatics and statistical analysis

2.5.1 Database searching and quantitation

The acquired spectra were analyzed by the MaxQuant software (v1.1.1.36)42.

Isotope clusters and SILAC doublets/triplets were extracted from the RAW data files

and quantified. A database of the most recent host-pathogen protein sequences was

compiled from the human International Protein Index protein sequence database (IPI

human v3.68) and UniProt/Swiss-Prot (11/7/2011) Salmonella Typhimurium (total

89753 protein sequences). MaxQuant’s Andromeda43 algorithm was used to identify

proteins with the following search parameters: carbamidomethylation of cysteine as a

fixed modification; oxidation of methionine, acetylation of protein N-terminal and

SILAC labeling as variable modifications; trypsin/P cleavage with a maximum of 2

missed cleavages, 0.5 Da mass tolerance for MS/MS. False discovery rates were

estimated by searching against a reversed sequence concatenated target-decoy

database44. A maximum false discovery rate of 1.0% at both the protein and peptide

levels was accepted for protein identifications.
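The target-decoy idea can be sketched in a few lines of Python. This is a simplified illustration, not MaxQuant's internal procedure: it assumes each match is a (score, is_decoy) pair and estimates the FDR as the ratio of decoy to target matches at or above a score threshold. The function names and scores are hypothetical.

```python
def reversed_decoy(sequence: str) -> str:
    """Build a decoy entry by reversing a target protein sequence."""
    return sequence[::-1]


def estimated_fdr(matches: list[tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR as decoy hits / target hits among matches scoring >= threshold."""
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


# Hypothetical search results: (score, matched a decoy sequence?)
matches = [(100.0, False), (90.0, False), (80.0, True), (70.0, False)]
print(reversed_decoy("MKTAYIAK"))
print(estimated_fdr(matches, threshold=85.0))
print(estimated_fdr(matches, threshold=60.0))
```

In practice the threshold is lowered until the estimated FDR reaches the accepted maximum, here 1.0%.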

2.5.2 Visualizing protein chromatograms with R

R (version 2.11.1) scripts were used to plot thousands of M/L and H/L protein

profiles to serve as a visual reference and for inspecting the raw data quickly.

Plotting the individual profiles together in a 10x4 grid reduced the pages needed to

display all the profiles of one replicate. In generating the ‘landscape’ view, plotting

all the protein profiles together using the default settings is not visually informative

as the result is one thick line. To overcome this, the transparency (alpha channel) was

adjusted using the two hexadecimal digits appended to the end of a hexadecimal color code. For

example, in Figure 3.3 the line color was set to col="#00000025". By adjusting the

transparency, thousands of profiles can be plotted together, and features of the

profile landscape can be seen.
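To make the alpha value concrete, the following Python sketch converts between an eight-digit hexadecimal color of the form #RRGGBBAA and an opacity fraction. It is an illustration rather than the R code used for the figures, and the helper names are hypothetical.

```python
def alpha_from_hex(color: str) -> float:
    """Return the opacity (0 to 1) encoded in the last two digits of '#RRGGBBAA'."""
    if len(color) != 9 or not color.startswith("#"):
        raise ValueError("expected a color of the form #RRGGBBAA")
    return int(color[7:9], 16) / 255


def hex_with_alpha(rgb: str, alpha: float) -> str:
    """Append an alpha channel to a '#RRGGBB' color, given an opacity from 0 to 1."""
    if len(rgb) != 7 or not rgb.startswith("#"):
        raise ValueError("expected a color of the form #RRGGBB")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return f"{rgb}{round(alpha * 255):02X}"


# The suffix "25" is hexadecimal for 37, so each line is drawn at 37/255 opacity.
print(alpha_from_hex("#00000025"))
print(hex_with_alpha("#000000", 0.5))
```

At this opacity, roughly seven overlapping black lines are needed before a region approaches solid black, which is what lets dense regions of the landscape stand out.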

2.5.3 Filtering protein chromatograms

Chromatogram (M/L) ratio profiles across the 45 fractions were filtered with a

two-step algorithm: (1) data for a given protein were retained only where there were

quantified ratios in at least three consecutive SEC fractions, and (2) at least one of the

three data points had to be greater than a specified minimum threshold value (0.1-.2).
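The two filtering criteria can be sketched as follows. This Python example is an interpretation for illustration, not the script used in this work: it assumes missing ratios are stored as None and that both criteria are evaluated on the same sliding window of three fractions. The function name and example profiles are hypothetical.

```python
from typing import Optional, Sequence


def passes_filter(profile: Sequence[Optional[float]],
                  min_ratio: float,
                  window: int = 3) -> bool:
    """Keep a profile if some run of `window` consecutive quantified fractions
    contains at least one ratio above `min_ratio`."""
    for start in range(len(profile) - window + 1):
        chunk = profile[start:start + window]
        # Step 1: every fraction in the window must have a quantified ratio.
        if any(value is None for value in chunk):
            continue
        # Step 2: at least one of those ratios must exceed the threshold.
        if max(chunk) > min_ratio:
            return True
    return False


profiles = {
    "protein_a": [None, 0.3, 0.5, 0.2, None],   # three consecutive values, peak 0.5
    "protein_b": [0.3, None, 0.5, 0.2, None],   # never three in a row
    "protein_c": [0.1, 0.1, 0.1, None, None],   # consecutive but too low
}
kept = [name for name, profile in profiles.items() if passes_filter(profile, 0.4)]
print(kept)
```

Raising `min_ratio` removes low-intensity profiles, while the window size controls how much chromatographic continuity is required.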

2.5.4 Clustering protein chromatograms

The R package ‘gplots’ (version 2.11.3) was used to hierarchically cluster

filtered protein profiles and display them in a heatmap. The CORUM database of

protein complexes was used for annotating the heatmap with well characterized

protein complexes.37

2.5.5 Protein complex enrichment analysis

The COMPLex Enrichment Analysis Tool (COMPLEAT)45 contains a comprehensive

resource of human protein complexes (3,638 compiled from literature sources and

6,251 predicted for a combined total of 9,293 human protein complexes). The tool

was developed to analyze high-throughput genomic and proteomic datasets without

the need of preselecting hits. A complex score (ciqm) is calculated by mapping the

data to protein complexes, sorting the data from highest to lowest, then calculating

the interquartile mean of the data. A p-value is also computed by comparing the ciqm

score to 1000 random complexes of the same size.
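The scoring idea can be sketched in Python. This is a simplified illustration and not the COMPLEAT implementation: the interquartile mean here trims one quarter of the values (rounded down) from each end, and the p-value is the fraction of random protein sets of the same size whose score is at least as high as the observed one. The function names, the one-sided test, and the fixed seed are all assumptions.

```python
import random
from statistics import mean


def interquartile_mean(values: list[float]) -> float:
    """Mean of the middle values after trimming a quarter from each end."""
    ordered = sorted(values)
    trim = len(ordered) // 4
    return mean(ordered[trim:len(ordered) - trim])


def complex_p_value(complex_values: list[float],
                    background: list[float],
                    n_random: int = 1000,
                    seed: int = 0) -> float:
    """One-sided empirical p-value against random sets of the same size."""
    rng = random.Random(seed)
    observed = interquartile_mean(complex_values)
    size = len(complex_values)
    hits = sum(
        1 for _ in range(n_random)
        if interquartile_mean(rng.sample(background, size)) >= observed
    )
    return hits / n_random


# Hypothetical data: a background of 100 protein values and one high-scoring complex.
background = [float(value) for value in range(100)]
print(interquartile_mean([1, 2, 3, 4, 5, 6, 7, 100]))
print(complex_p_value([96.0, 97.0, 98.0, 99.0], background))
```

Using the interquartile mean rather than the plain mean keeps a single extreme subunit from dominating the score of a complex.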

Protein complex enrichment was performed with COMPLEAT by analyzing each

fraction of PCP-SILAC separately. Protein complexes were included in the list of

enriched complexes if they were in at least two of the three biological replicates

using a p-value threshold of 0.05.

2.5.6 Differential protein changes

Three independent biological replicates of PCP-SILAC were generated, and

all ratios were converted to log2 values. The protein profiles of two abundant

complexes (proteasome and 14-3-3) were plotted in Excel for all replicates and peaks

were aligned manually by shifting the whole dataset left or right in increments of one

fraction. Differential protein changes were calculated by three methods: fold

change, Z-test, and linear modeling with a moderated t-test.

Fold Change: H/M ratios with at least a 1.5-fold change (FC) were identified. Proteins

having at least two replicates with H/M ratios above this FC threshold cutoff were

considered significant.
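The fold-change rule can be sketched in Python. This is an illustration under stated assumptions: ratios are already on the log2 scale, missing values are stored as None, and changes in either direction count toward the threshold. The function name and example values are hypothetical.

```python
import math
from typing import Optional, Sequence


def is_fold_change_hit(log2_ratios: Sequence[Optional[float]],
                       fold_change: float = 1.5,
                       min_replicates: int = 2) -> bool:
    """True if enough replicates show a log2 H/M ratio beyond the fold-change cutoff."""
    cutoff = math.log2(fold_change)  # 1.5-fold corresponds to about 0.585 on the log2 scale
    n_beyond = sum(
        1 for ratio in log2_ratios
        if ratio is not None and abs(ratio) >= cutoff
    )
    return n_beyond >= min_replicates


# Three replicates of one protein in one SEC fraction.
print(is_fold_change_hit([0.70, 0.60, 0.10]))
print(is_fold_change_hit([0.70, 0.10, None]))
```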

Z-Test: H/M ratios across three replicates were tested in each fraction

independently with the null hypothesis that the average (log2) H/M ratio was equal to 0 using

Microsoft Excel’s two-tailed Z-test function. Multiple hypothesis testing was

accounted for using an FDR method and a q-value threshold of 0.05.
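The test and the multiple-testing correction can be sketched in Python. This is an illustration rather than a reproduction of the Excel calculation: the standard deviation is estimated from the replicates themselves, and the Benjamini-Hochberg procedure is used as a stand-in because the specific FDR method is not named here. The function names and example values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def two_tailed_z_test(values: list[float], mu0: float = 0.0) -> float:
    """Two-tailed p-value for the mean of `values` differing from `mu0`.
    Requires at least two values with non-zero spread."""
    standard_error = stdev(values) / math.sqrt(len(values))
    z = (mean(values) - mu0) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda index: p_values[index])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# log2 H/M ratios for three hypothetical proteins, three replicates each.
ratios = {"p1": [1.0, 1.2, 0.8], "p2": [-0.1, 0.0, 0.1], "p3": [0.4, 0.9, 0.1]}
p_values = [two_tailed_z_test(values) for values in ratios.values()]
q_values = benjamini_hochberg(p_values)
print([name for name, q in zip(ratios, q_values) if q <= 0.05])
```

With only three replicates, the sample standard deviation is itself noisy, which is part of the motivation for the moderated statistics described next.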

Linear Modeling: M/L and H/L ratios for three replicates were tested in the R

environment with the limma package46. Data were fitted to a simple linear model and

tested with Student’s t-test under the null hypothesis that the average ratio is 0. The

empirical Bayes moderated t-test was applied to each t statistic. Multiple hypothesis

testing was accounted for across protein chromatograms using the Storey and Tibshirani

FDR method and a q-value cut-off of 0.05.47

2.5.7 GO analysis of changing proteins

Dynamic proteins were investigated further with DAVID Bioinformatics Database

(DAVID Bioinformatics Resource v 6.7)48,49. Calculation of GO term over-representation

was performed with IPI identifiers comparing dynamic proteins

(identified by the z-test, q-value threshold of 0.05) to the entire list of identified

proteins as background. A p-value threshold of 0.01 was used for the analysis.

2.6 S. Typhimurium strains

Table 2.1 Salmonella strains used in this thesis

Strain (S. enterica sv. Typhimurium)   Genotype                              Source or reference

SL1344                                 wild-type, SmR                        Boyle et al.50
ΔsopB SL1344                           SL1344 sopB-                          Boyle et al.50
ΔsptP SL1344                           SL1344 sptP-                          Boyle et al.50
ΔinvA SL1344                           SL1344 invA-                          Boyle et al.50
ΔsopD SL1344                           SL1344 sopD-                          Boyle et al.50
ΔsipAΔsopEΔsopE2 SL1344                SL1344, sipA-, sopE-, sopE2-          Boyle et al.50
RFP SL1344                             wild-type, SmR, AmpR                  Zheng, YL51
RFP ΔsopB SL1344                       SL1344 sopB-, AmpR                    This work
RFP ΔsptP SL1344                       SL1344 sptP-, AmpR                    This work
RFP ΔinvA SL1344                       SL1344 invA-, AmpR                    This work
RFP ΔsopD SL1344                       SL1344 sopD-, AmpR                    This work
RFP ΔsipAΔsopEΔsopE2 SL1344            SL1344, sipA-, sopE-, sopE2-, AmpR    This work

2.6.1 RFP gene transfection of S. Typhimurium

S. Typhimurium SL1344 and mutants (see Table 2.1) were grown in liquid LB at

37°C for 2.5 hrs with shaking. Salmonella were pelleted at 4000 g for 10 min and

resuspended in 250 µL of ice-cold 15% glycerol solution. Electrocompetent Salmonella

(45 µL/transformation) were mixed with 0.2 µL RFP plasmid in a prechilled cuvette.

Salmonella were electroporated using a Gene Pulser apparatus (BioRad) following the

manufacturer’s instructions. Electroporation was done at 2500 V and 200 Ω resistance.

2.6.2 Salmonella protein synthesis assay

HeLa cells were seeded in 96-well tissue culture plates and grown overnight.

RFP expressing Salmonella strains (listed in Table 2.1) were used to infect HeLa cells

at an MOI of 50 for 30 minutes at 37°C. Protein synthesis was measured by AHA

incorporation (50 µM AHA) using Click-iT AHA Alexa Fluor 488 Protein Synthesis HCS

Assay (Invitrogen) following the manufacturer’s instructions.

Three populations of SILAC HeLa cells are grown. Heavy cells are infected with Salmonella and

medium cells act as a control. Cells are harvested, lysed with a Dounce homogenizer, and the

lysate is spun at high speed to remove cellular debris. Protein complexes are enriched away from

most monomeric proteins using a 100,000 molecular weight cutoff filter. Equal volumes of

medium and heavy protein complexes are mixed and fractionated by SEC. Light cells serve as an

internal standard that is spiked into each H/M fraction before analysis by LC-MS/MS.

Chapter 3: Results

Global interactome studies have traditionally been investigated with Y2H and

TAP-TAG technologies, costly methods that require extensive labor for cloning and

expressing thousands of tagged proteins. Recently, a quantitative proteomics

approach for interactome studies was designed to be fast, affordable, and avoid tags

or chemical cross-linking agents. This proteomic interactome method, PCP-SILAC, takes

advantage of the high resolving power of the latest size exclusion chromatography

technology for large biomolecules and the accurate and quantitative properties of

LC-MS/MS SILAC technology (a schematic of the experimental workflow for PCP-SILAC is

shown in Figure 3.1).

Figure 3.1 Experimental workflow for PCP SILAC with Salmonella infection.

Raw MS/MS sequence data from 50 SEC fractions are processed with MaxQuant for identification and

quantitation of proteins. Protein profiles are generated from the M/L ratios. Data are filtered in Excel

to remove noise and visualized with Excel and R. Dynamic complexes are identified statistically with

H/M ratios.

Similar to other high throughput genomic technologies, PCP-SILAC generates an

incredibly large amount of data in a single replicate (e.g., 66.3 GB of MS/MS data).

Creating software for analyzing this type of data is currently one of the bottlenecks of

the field, and was a major component in the development of the PCP-SILAC method.

The data analysis workflow of a PCP-SILAC experiment is depicted in Figure 3.2 with a

visual representation of generating protein profiles from quantitative mass

spectrometry data across multiple fractions and downstream preprocessing,

visualization, and statistical analysis.

Figure 3.2 PCP-SILAC data analysis workflow

Generated by plotting all protein profiles of one replicate with R.

3.1 Mapping a global interactome landscape with PCP-SILAC

Three independent biological replicates of PCP-SILAC with Salmonella infection resulted in

the identification and quantification of 4,049 human HeLa proteins and identification

of 21 Salmonella proteins at a 1.0% FDR. The PCP-SILAC interactome landscape

reveals the complex topology of cytosolic protein complexes on a global scale, as

shown below in Figure 3.3 with all the protein profiles of one replicate plotted

together. Protein complexes are separated by size (the heavier complexes eluting

from the SEC column in the early fractions). Co-eluting proteins of a protein complex

give rise to a characteristic Gaussian peak that can be distinguished by its center,

width, and height. There are a handful of prominent peaks that represent the most

abundant cytosolic complexes (e.g. the proteasome centered around fraction 23).

The number of protein complexes a protein is associated with can be visualized by

plotting the protein profiles individually, as shown in Figure 3.4.
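As an illustration of how a single co-elution peak can be described by its center, width, and height, the following minimal Python sketch generates an idealized Gaussian profile across 50 SEC fractions. It is not part of the Excel/R workflow used in this thesis. The center of 23 is taken from the proteasome example above; the width and height values and the function name are assumptions chosen only for illustration.

```python
import math

def gaussian_peak(fraction, center, width, height):
    """Idealized co-elution peak: ratio expected at a given SEC fraction."""
    return height * math.exp(-((fraction - center) ** 2) / (2 * width ** 2))

fractions = range(1, 51)  # 50 SEC fractions, numbered from 1

# Hypothetical peak parameters (only the center comes from the text).
profile = [gaussian_peak(f, center=23, width=2.0, height=4.0) for f in fractions]

# The apex of the simulated profile falls on the chosen center fraction.
apex_fraction = max(fractions, key=lambda f: profile[f - 1])
```

A protein that belongs to several complexes would be modeled as a sum of such peaks, one per complex.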

Figure 3.3 PCP-SILAC landscape interactome

Profiles of 40 randomly selected proteins from one replicate are plotted (M/L ratio y-axis and SEC
fraction x-axis). The number of peaks in a profile represents the number of protein complexes that
protein is associated with (as detected by PCP-SILAC).

Figure 3.4 Individual protein profiles


In the data analysis workflow, one of the early steps that required fine-tuning was the
filter step. During the PCP-SILAC Salmonella experiments, it was noted that in one of the
practice replicates too much internal standard had been added to the fractions, resulting in
quantified ratios that were much lower than the threshold value applied in the paper that
developed the SEC-PCP-SILAC method33. A range of minimum threshold values (0.1-2.0) was
evaluated, judged by the number of protein profiles that remained after the two filtering
steps, as shown in Figure 3.5. A threshold of 0.6 appeared to be too stringent, and the
optimal value lies between 0.2 and 0.4. The relative weight of the two filtering steps was
calculated and is diagrammed in Figure 3.5 B. With this dataset, filtering the data for a set
of 3 consecutive data points (mini clustering) had a much greater relative impact on the
number of proteins filtered out than the threshold step. The cluster filter step can be
adjusted in the future to see whether a less stringent cluster criterion (such as a larger
cluster with holes) would allow more profiles to be characterized.
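The two filter steps described above can be sketched as follows. This is a minimal Python illustration, not the Excel/R workflow used in this thesis. The function name, the use of None for missing ratios, the example profiles, and the default threshold of 0.3 (a value inside the 0.2-0.4 range discussed above) are assumptions.

```python
from typing import List, Optional

def passes_filters(profile: List[Optional[float]],
                   min_ratio: float = 0.3,
                   run_length: int = 3) -> bool:
    """Return True if the profile keeps at least `run_length` consecutive
    fractions whose ratio is at or above `min_ratio`."""
    run = 0
    for ratio in profile:
        # Threshold step: missing or low ratios are treated as noise.
        if ratio is not None and ratio >= min_ratio:
            run += 1
            # Cluster step: require a run of consecutive surviving points.
            if run >= run_length:
                return True
        else:
            run = 0
    return False

# Hypothetical profiles (ratio per SEC fraction).
profiles = {
    "protein_a": [None, 0.1, 0.5, 0.9, 0.6, 0.1],   # three consecutive points survive
    "protein_b": [0.5, None, 0.7, 0.1, 0.8, None],  # only isolated points survive
}
kept = [name for name, values in profiles.items() if passes_filters(values)]
```

A less stringent criterion that tolerates holes could be explored by counting surviving points within a sliding window instead of requiring a strictly consecutive run.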

Figure 3.5 Filtering protein profiles


Hierarchical clustering of the protein profiles was one method used to visually investigate
the protein complexes. Many of the protein complexes identified by this clustering technique
are curated in protein interaction databases like CORUM. One protein complex identified from
the clustering was the eukaryotic translation initiation factor 3 protein complex (eIF3).
This complex has been described in previous reports as a versatile scaffold for translation
initiation complexes and is a known interactor of mTOR6. Proteins that cluster near a protein
complex can be investigated further as candidate members of that complex.

Figure 3.6 Heatmap of hierarchical clustered protein profiles of one replicate.

Heatmap x-axis: Fraction.

Classifying protein complexes by hierarchically clustering protein profiles works well for
proteins that comigrate as only one protein complex and have a single Gaussian peak. The
method’s performance declines when proteins comigrate in more than one protein complex
because the clustering is based on similarities across the whole chromatogram. For this
reason, hierarchical clustering provides a limited view of the interactome captured in a
PCP-SILAC experiment. To provide a more global analysis of the interactome, an alternative
approach that identifies protein complexes on a per-fraction basis was applied to the
PCP-SILAC dataset.
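The limitation of whole-chromatogram similarity can be illustrated with a small Python sketch using simulated profiles. The distance measure used for the clustering in this thesis is not stated here, so Pearson correlation is an assumption, as are all peak positions, widths, and names.

```python
import math

def gaussian(f, center, width=2.0, height=1.0):
    """Idealized co-elution peak at a given SEC fraction."""
    return height * math.exp(-((f - center) ** 2) / (2 * width ** 2))

def pearson(x, y):
    """Pearson correlation across the whole chromatogram."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

fractions = range(1, 51)

# Two hypothetical subunits that elute only with one complex (fraction 23).
subunit_1 = [gaussian(f, 23) for f in fractions]
subunit_2 = [gaussian(f, 23, height=0.8) for f in fractions]

# A hypothetical protein found in that complex and in a second one (fraction 40).
two_complexes = [gaussian(f, 23) + gaussian(f, 40) for f in fractions]

r_single = pearson(subunit_1, subunit_2)      # single-peak partners
r_shared = pearson(subunit_1, two_complexes)  # partner with a second peak
```

In this simulation the single-peak subunits correlate almost perfectly, while the protein with a second peak correlates noticeably less well with its true partner, even though the two co-elute at fraction 23.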

The COMPLEAT tool performs protein complex enrichment analysis on submitted proteomic
datasets using a comprehensive resource of human protein complexes. A p-value is calculated
for each protein complex mapped from the dataset by comparing its score to those of 1,000
randomly generated protein complexes of the same size. Using a p-value cutoff of 0.05 and the
criterion of being present in at least two of the three replicates, the PCP-SILAC dataset was
enriched with 346 protein complexes (the full list is provided in Appendix A). Of the

enriched protein complexes identified, $ were from literature sources and % are

predicted protein complexes.
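The idea of scoring a complex against randomly generated complexes of the same size can be sketched in Python as an empirical p-value. This does not reproduce COMPLEAT's actual scoring. The mean-of-member-values score, the add-one correction, the function name, and the toy data are all assumptions made for illustration.

```python
import random

def empirical_p_value(observed_score, complex_size, protein_scores,
                      n_random=1000, seed=0):
    """Fraction of random same-size complexes scoring at least as high as
    the observed complex (with an add-one correction)."""
    rng = random.Random(seed)
    proteins = list(protein_scores)
    hits = 0
    for _ in range(n_random):
        members = rng.sample(proteins, complex_size)
        # Assumed score: mean of the per-protein values of the members.
        random_score = sum(protein_scores[p] for p in members) / complex_size
        if random_score >= observed_score:
            hits += 1
    return (hits + 1) / (n_random + 1)

# Toy data: 100 hypothetical proteins with values 0.00 to 0.99.
toy_scores = {f"p{i}": i / 100 for i in range(100)}

p_high = empirical_p_value(0.99, 5, toy_scores)  # no random complex can reach 0.99
p_low = empirical_p_value(0.0, 5, toy_scores)    # every random complex reaches 0.0
```

A complex would then be called enriched when its empirical p-value falls below the chosen cutoff (0.05 in the text above).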

3.1.1 Applying PCP-SILAC to a host-pathogen system

One of the benefits of SILAC is the simplicity of multiplexing an experiment and

quantitatively measuring dynamic changes of the proteomic interactome landscape.

In the Nature Methods paper of PCP-SILAC, Kristensen et al. demonstrated the ability

of PCP-SILAC to detect temporal shifts of protein architecture after EGF stimulation.


In this study we applied PCP-SILAC to study the initial host-pathogen interactions of
Salmonella with HeLa cells. Light, medium, and heavy SILAC labels were fully incorporated by
growing the cells for five generations. Heavy-labeled cells were infected with WT Salmonella
enterica at an MOI of 100 for 30 minutes at 37 °C. Light, medium, and heavy cells were then
immediately harvested and their protein complexes analyzed by PCP-SILAC. Identifying and
characterizing the changes between the medium and heavy (non-infected vs. infected) cell
populations is the focus of the remainder of this thesis.

3.2 Identification of host proteins targeted during Salmonella

invasion

Three biological replicates of PCP-SILAC were performed testing Salmonella

infection versus noninfected conditions of two separate populations of labeled HeLa

cells. Before statistical analysis of control versus infected cells could be performed,

data from the replicates needed to be preprocessed.


3.2.1 Preprocessing of Replicate Data

Protein profiles of the proteasome and 14-3-3 complex subunits were used to

align replicate data since they are abundant complexes of the cytosol and are well

annotated in the literature. Three alignments are shown below in Figure 3.7.
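One simple way to align replicates on a reference complex is to shift each replicate so that the apex of the reference profile falls on the same fraction. The Python sketch below illustrates this idea only; the alignment procedure actually used in this thesis is not described in detail, and the function names, the fill value, and the toy profiles are assumptions.

```python
def apex_index(profile):
    """Index of the fraction with the highest ratio."""
    return max(range(len(profile)), key=lambda i: profile[i])

def shift_profile(profile, offset, fill=0.0):
    """Shift a profile by `offset` fractions, padding with `fill`."""
    n = len(profile)
    shifted = [fill] * n
    for i, value in enumerate(profile):
        j = i + offset
        if 0 <= j < n:
            shifted[j] = value
    return shifted

# Hypothetical profiles of one reference subunit in two replicates.
reference = [0, 0, 1, 4, 1, 0, 0, 0]
replicate = [0, 0, 0, 0, 1, 4, 1, 0]

# Offset that moves the replicate apex onto the reference apex.
offset = apex_index(reference) - apex_index(replicate)
aligned = shift_profile(replicate, offset)
```

The same offset would then be applied to every protein profile in that replicate.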

Figure 3.7 Aligning protein chromatograms with proteasome subunits

Panels: HC2, LMPX, PSC5.

3.2.2 Identifying dynamic proteins (log fold change, z-test, t-test)

Filtered replicate data from the aligned protein profiles were processed with three
complementary techniques for identifying proteins that change in response to Salmonella
infection. In the first approach we set a fold change threshold of 1.5 for the H/M ratios,
which resulted in a list of 15 proteins. We next applied a Z-test (location test) to the
protein H/M ratios, based on the null hypothesis that host proteins not dynamically affected
by Salmonella infection would have the same protein profiles in the two conditions and
therefore a ratio of 1. The Z-test identified 226 proteins at a q-value threshold of 0.05,
shown in Figure 3.8 as a heatmap. The final approach was to fit linear models to the M/L and
H/L ratios and apply an empirical Bayes moderated t-test to the data, with multiple
hypothesis testing accounted for by an FDR method. This flagged six proteins as significant
at a q-value threshold of 0.05. The dynamic proteins are listed in Tables 3.1-3.3.
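The first two approaches and the multiple-testing correction can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code used in this thesis: the fold change filter is assumed to be symmetric (up or down), the Z-test is assumed to use a known standard deviation on log2 ratios, and Benjamini-Hochberg is assumed as the FDR method. All function names and example values are hypothetical. The empirical Bayes moderated t-test is not sketched here.

```python
import math
from statistics import NormalDist, mean

def passes_fold_change(h_over_m, threshold=1.5):
    """Fold change filter: flag ratios at least `threshold`-fold up or down."""
    return abs(math.log2(h_over_m)) >= math.log2(threshold)

def z_test_p_value(log2_ratios, sigma):
    """Two-sided p-value for H0: mean log2(H/M) = 0, i.e. H/M = 1."""
    z = mean(log2_ratios) / (sigma / math.sqrt(len(log2_ratios)))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def benjamini_hochberg(p_values):
    """Convert p-values to q-values (Benjamini-Hochberg adjusted p-values)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical log2(H/M) values for one protein across three replicates.
p_example = z_test_p_value([-2.0, -2.2, -2.4], sigma=0.5)

# Hypothetical p-values for four proteins.
q_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Proteins whose q-value falls below 0.05 would be reported as dynamic under this sketch.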

Figure 3.8 Heatmap of changing proteins identified by the z-test


Table 3.1 Identification of dynamic proteins after Salmonella infection (threshold log FC)

Gene Name  IPI Identifier  Protein Name
H2BFD      IPI00646240     Histone H2B
H2AFC      IPI00291764     Histone H2A
H4/A       IPI00453473     Histone H4
HNRNPH1    IPI00479191     Heterogeneous nuclear ribonucleoprotein
BASP1      IPI00299024     Brain abundant, membrane attached signal protein 1
CDC47      IPI00291764     CDC47 Homolog
BM28       IPI00184330     DNA replication licensing factor
RPL23      IPI00010153     60S Ribosomal Protein
BAG3       IPI00641582     BAG family chaperone regulator
HK1        IPI00903226     Hexokinase 1
CAP43      IPI00022078     N-myc downstream regulated 1
HSP60      IPI00784154     60 kDa heat shock protein
GCN1L1     IPI00001159     Translational activator GCN1
HSP27      IPI00025512     28 kDa heat shock protein
PTRF       IPI00176903     Cavin-1 Polymerase I and transcript release factor

Table 3.2 Identification of dynamic proteins after Salmonella infection (z-test)

Gene Name  IPI Identifier  Protein Name                                    Function                        q-value
HSP27      IPI00025512     28 kDa heat shock protein                       Regulate Actin Dynamics         1.54E-08
PAK2       IPI00419979     C-t-PAK2                                        Regulate Cytoskeleton Dynamics  4.31E-02
ASNS       IPI00306960     Asparagine-tRNA ligase                          Protein Translation             7.87E-04
CAP1       IPI00939159     Adenylyl cyclase-associated protein 1           Regulate Actin Dynamics         2.03E-02
GCN1L1     IPI00001159     Translational activator GCN1                    Protein Translation             3.30E-04
RPL23      IPI00010153     60S ribosomal protein L17                       Protein Translation             6.43E-03
HSP60      IPI00784154     60 kDa heat shock protein                       Protein Folding                 8.92E-07
IQGAP1     IPI00009342     Ras GTPase-activating like protein              Regulation of GTPase Activity   2.09E-04
BASP1      IPI00299024     22 kDa neuronal tissue-enriched acidic protein  Regulation of Transcription     2.15E-06
AHNAK      IPI00021812     Desmoyokin                                      Regulate Actin Dynamics         7.28E-09
AHNAK2     IPI00856045     Protein AHNAK2                                  Regulate Actin Dynamics         1.85E-04
NHERF      IPI00003527     Ezrin-radixin-moesin-binding phosphoprotein     Regulate Actin Dynamics         4.71E-04
EIFA       IPI00025491     Eukaryotic initiation factor 4A                 Protein Translation             2.08E-02
GLRX3      IPI00008552     PKC-interacting cousin of thioredoxin           Cell Redox Homeostasis          8.34E-04
CAPZA1     IPI00005969     F-actin capping protein                         Regulate Actin Dynamics         1.92E-06

* Table continued in Appendix B


Table 3.3 Identification of dynamic proteins after Salmonella infection (t-test)

Overlap  Gene Name  Protein Name                                        Function                        Log2 Fold Change (H/M)  q-value   Reference
***      PTRF       Cavin-1 Polymerase I and transcript release factor  Caveolae biogenesis             2.13                    1.03E-06  52,53
**       ASNS       Asparagine-tRNA ligase                              Protein Translation             -1.74                   5.59E-03
***      HSP27      28 kDa heat shock protein                           Regulate Actin Dynamics         -2.19                   1.50E-02  54
**       PAK2       C-t-PAK2                                            Regulate Cytoskeleton Dynamics  -2.52                   3.17E-02  55,56
**       CAP1       Adenylyl cyclase-associated protein 1               Regulate Actin Dynamics         -1.86                   4.42E-02  57-59
*        PSMB3      Proteasome Chain 13                                 Protein Degradation             -1.36                   3.51E-03

The number of asterisks (*) indicates the number of tests in which the gene was identified as a dynamic protein.


All three methods detected the 28 kDa heat shock protein (HSP27) as being dynamic in
response to Salmonella infection. The average M/L and H/L ratios of HSP27 are plotted in
Figure 3.9. Comparing the two profiles, there is a distinct shift of the H/L peak center to
the right of the M/L peak center, suggesting disassembly or rearrangement of protein
complexes that include HSP27 as a subunit. A high proportion of HSP27’s fractions (16 of 46)
were identified by the Z-test as significantly changing (q-value < 0.05).

Figure 3.9 HSP27 PCP-SILAC profile

HSP27: log2(Ratio) plotted against SEC fraction. Series: Avg M/L, Avg H/L.

3.2.3 Integrating a phosphoproteomic Salmonella study

A previous study in our lab examined the dynamic changes of the human host phosphoproteome
in a time course experiment of early (1-30 min) Salmonella Typhimurium infection.
Phosphorylation is an important protein modification for regulating the dynamic assembly and
disassembly of protein complexes, so we were interested in the overlap between the changing
proteins identified in the phosphoproteomic study and those identified in this study. The
results of this analysis (an overlap of 70 proteins) are shown in Figure 3.10 as a Venn
diagram.

Figure 3.10 Phosphoproteomic and PCP-SILAC venn diagram


3.2.4 GO enrichment analysis

The DAVID functional analysis tool was used for GO enrichment analysis of the dynamic
proteins identified by the Z-test. Although this list may contain relatively more false
positives than those of the other two methods, the Z-test has higher sensitivity and captures
dynamic proteins to a greater depth. The list of candidates identified by the Z-test is also
large enough for GO analysis, whereas the candidate lists identified by fold change and
t-test were too small (15 and 6 candidate proteins, respectively). The GO term ‘translation’
was enriched five-fold among the dynamic proteins of Salmonella-infected cells compared to
the background of proteins identified in the experiment. A second term that was significantly
enriched was ‘negative regulation of ubiquitin-protein ligase activity’. A bar graph of the
enrichment analysis is shown in Figure 3.11.
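The meaning of a fold enrichment relative to the experimental background can be illustrated with the Python sketch below. The list size (226) and background size (4,049) come from this chapter, but the two annotation counts are hypothetical values chosen only to give an enrichment of about five-fold. The plain hypergeometric test shown here is an assumption and is not the modified Fisher exact statistic that DAVID reports.

```python
from math import comb

def fold_enrichment(k, n, K, N):
    """(annotated fraction in the list) / (annotated fraction in background)."""
    return (k / n) / (K / N)

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n proteins from N, of which K carry the term."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(n, K) + 1))
    return favourable / total

N = 4049  # background: proteins identified in the experiment
n = 226   # dynamic proteins identified by the Z-test
K = 160   # hypothetical: background proteins annotated with the GO term
k = 45    # hypothetical: dynamic proteins annotated with the GO term

fold = fold_enrichment(k, n, K, N)
p_value = hypergeom_upper_tail(k, n, K, N)
```

Using the proteins identified in the experiment as the background, rather than the whole proteome, keeps abundant and easily detected protein classes from appearing enriched by default.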

Figure 3.11 GO functional analysis of dynamic PCP-SILAC proteins with infection


3.3 Host protein synthesis machinery targeted during

Salmonella invasion

We were intrigued by the GO analysis results because recent reports have found host
translational machinery to be targeted in other host-pathogen systems, but this has not been
reported for Salmonella. To test the hypothesis that host translational machinery is targeted
by Salmonella during infection, we measured protein synthesis using a fluorescent assay as an
alternative to the traditional radioactive methionine approach.

In this assay, L-azidohomoalanine (AHA) acts as an analogue of methionine and is incorporated
into proteins during active protein synthesis. After the cells are fixed, Click-iT chemistry
is used to label AHA with a fluorescent marker (Alexa Fluor 488) to measure nascent protein
synthesis. The AHA assay was tested in HeLa cells using a range of concentrations of
cycloheximide (an inhibitor of protein synthesis). A dose response curve, along with
corresponding images of the green (AHA) and blue (nuclei) channels of cycloheximide-treated
and untreated cells, is shown in Figure 3.12.

Figure 3.12 AHA protein synthesis assay

A. Chemical structure of L-methionine and L-azidohomoalanine. AHA assay with HeLa cells, AHA (50 µM).
B. Dose response of cycloheximide, an inhibitor of protein translation. C. Image of AHA fluorescence in
untreated HeLa cells and D. HeLa cells treated with 200 µM cycloheximide for 30 minutes.

We next used the AHA assay to measure protein synthesis in HeLa cells infected with WT
Salmonella expressing red fluorescent protein (RFP). Based on three biological replicates, we
saw a significant decrease in protein synthesis in Salmonella-infected cells compared to the
non-infected control (two-tailed t-test, p-value 0.01). The PCP-SILAC Salmonella data
identified cytosolic host protein complexes involved in protein translation as being dynamic
in response to Salmonella infection, and the results from the AHA protein synthesis
experiments reveal that the host translational machinery is being downregulated.

Table 3.4 Protein synthesis of Salmonella strains tested

Click-iT AHA Signal Intensity

Strains                          R1     R2     R3     Average  Std Dev  T-test p-value
Control (No Salmonella)          355.8  359.6  407.7  374.4    28.94
RFP WT SL1344                    270.3  284.8  306.2  287.1    18.0     0.01 (WT vs Control)
RFP ΔsopB SL1344                 243.8  298.3  302.9  281.7    32.9     0.81 (WT vs Δ)
RFP ΔsptP SL1344                 272.5  303.1  362.3  312.6    45.6     0.42 (WT vs Δ)
RFP ΔInvA SL1344                 226.8  221.1  285.8  251.5    47.8     0.29 (WT vs Δ)
RFP ΔsopD SL1344                 246.6  263.6  285.9  265.4    19.7     0.23 (WT vs Δ)
RFP ΔsipA ΔsopE ΔsopE2 SL1344    315.5  283.5  288.9  296.0    17.1     0.57 (WT vs Δ)

To further investigate our hypothesis that Salmonella targets host cell translational
machinery during infection, we tested five additional Salmonella strains that each had one or
more SPI-1 effectors knocked out, covering a total of seven effectors. None of the mutant
knockout strains showed a significant difference in protein synthesis compared to wild type
(two-tailed t-test, p-values greater than 0.05). This suggests that host protein synthesis is
not regulated by the SPI-1 effectors knocked out in the strains tested. Results of the AHA
protein synthesis experiments are listed in Table 3.4 and plotted in Figure 3.13.
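The comparisons in Table 3.4 can be illustrated with a short Python sketch that computes a two-sample t statistic from the replicate intensities. The replicate values are taken from Table 3.4. The equal-variance (pooled) form of the test is an assumption, since the variant used is not stated, and the function name is hypothetical. Instead of computing a p-value, the statistic is compared with 2.776, the two-tailed critical value at α = 0.05 for 4 degrees of freedom.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error

# Click-iT AHA signal intensities (R1, R2, R3) from Table 3.4.
control = [355.8, 359.6, 407.7]
wild_type = [270.3, 284.8, 306.2]
delta_sopB = [243.8, 298.3, 302.9]

T_CRITICAL_DF4 = 2.776  # two-tailed, alpha = 0.05, 3 + 3 - 2 = 4 degrees of freedom

t_wt_vs_control = pooled_t_statistic(wild_type, control)
t_wt_vs_sopB = pooled_t_statistic(wild_type, delta_sopB)

wt_differs_from_control = abs(t_wt_vs_control) > T_CRITICAL_DF4
sopB_differs_from_wt = abs(t_wt_vs_sopB) > T_CRITICAL_DF4
```

Under these assumptions the WT versus control comparison exceeds the critical value, while the WT versus ΔsopB comparison does not, in line with the p-values reported in Table 3.4.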


Figure 3.13 Protein synthesis in HeLa cells infected with WT Salmonella and effector

knockout strains


Chapter 4: Discussion

4.1 Current Salmonella-host interactome

The interface of evolving host-pathogen systems has typically been viewed from a
reductionist perspective. These types of reductionist studies have identified 62
Salmonella-host protein interactions that support our current understanding of Salmonella’s
complex pathogenesis60. High throughput technologies have allowed scientists to capture
global dynamics within the host during infection18. The PCP-SILAC Salmonella results reported
here constitute the first global host-Salmonella protein interactome study and provide novel
insight into the shifts of cytosolic protein complex architecture occurring in the host.
PCP-SILAC is currently being adapted for the analysis of membrane protein complexes, which
will fill in a critical component of Salmonella’s host-pathogen interactome involved in cell
signaling.

4.2 The identified dynamic host proteins during Salmonella

invasion

We have identified a subset of proteins from our PCP-SILAC dataset that changed significantly
during Salmonella infection, suggesting that the protein complexes involving these proteins
have functional roles during Salmonella invasion. Since this is the first global
host-pathogen interactome study of Salmonella, we applied three different techniques to
determine which proteins have dynamic protein profiles. The overlap of candidate dynamic
proteins between these techniques is high, providing more confidence that the candidates are
not false positives. The candidate proteins that were identified by the t-test and overlapped
with the other lists are discussed in further detail below:

4.2.1 PTRF/Cavin-1

The plasma membrane of most mammalian cell types contains small (50-100 nm) flask-shaped
lipid invaginations known as caveolae61. Caveolae are a subclass of lipid rafts that are
enriched in cholesterol and typically contain caveolin, a hallmark protein of caveolae62.
These dynamic membrane structures are hot spots for signaling proteins and have many
important trafficking functions including lipid storage, endocytosis, and cell signaling63,64.

Caveolin was the first (and until recently the only) marker for caveolae. Further
investigation into their composition revealed another class of proteins associated with
caveolae, the cavins52,53. PTRF/Cavin-1 is a cytosolic phosphoprotein originally named
Polymerase I and Transcript Release Factor (PTRF) after its role in regulating the activity
of RNA transcription complexes. A second function was later demonstrated for PTRF/Cavin-1 in
caveolae biogenesis63,65.

Lipid rafts are often targeted by pathogens as a means to hijack host signaling circuitry66.
Previous work in our lab has shown that caveolin 1 is necessary for Salmonella invasion of
HeLa cells51, and similar findings were reported in endothelial M cells67. The PCP-SILAC
results presented here provide further support that Salmonella targets caveolae to gain entry
into the host, as PTRF/Cavin-1 was identified with a log2 fold change of 2.13 (interestingly,
this was the only significant protein identified with a positive fold change).
Immunofluorescence studies suggest that PTRF/Cavin-1 assembly with caveolae is regulated by
the mitogenic serine/threonine kinase ARAF168, a kinase shown to be regulated during
Salmonella invasion18.

4.2.2 HSP27

Heat shock proteins (HSPs) are ubiquitously expressed in all cell types and function
cooperatively in a network to maintain cellular balance during stressful conditions. The
chaperone activity of HSPs helps prevent misfolded or denatured proteins from aggregating (a
problem associated with aging and neurodegenerative diseases like Alzheimer’s). Small heat
shock proteins (HSPB) are 12-42 kDa and have a conserved α-crystallin domain69.

HSPB1 (HSP27), a small heat shock protein, has a number of reported

regulatory roles including controlling the cellular redox state, protein folding, protein

degradation, cytoskeleton dynamics, and anti-apoptotic activity70. HSP27 is an

abundant protein in the cytosol and occurs in small complexes as dimers and large

complexes as oligomers, depending on cellular conditions. Structural dynamics of

HSP27 are regulated by the cellular redox and phosphorylation status of HSP2771,72.

Under heat stress, HSP27 can be translocated to the nucleus and cytoskeleton70.

Two recent Salmonella phosphoproteomic studies, by Rogers and by Imami, focused on global
signaling dynamics of early and late stages of Salmonella infection in HeLa cells. Imami
reported that Salmonella directly targets HSP27 during late stages of infection with the SPI2
effector SteC (the only serine/threonine kinase encoded in the Salmonella genome)54. In vitro
and in vivo quantitative MS/MS experiments revealed six sites on HSP27 dynamically
phosphorylated by SteC (S9, S15, S43/S49/S50/Y54, S82, T174/S176, S199). Rogers’
phosphorylation study of early Salmonella infection (time course of 2-30 min) identified two
dynamically regulated sites on HSP27 (S15 and S82) at 10 and 20 minutes18. The natural host
kinase of HSP27 has been reported to phosphorylate HSP27 at three sites (S15, S78, and S82).
The additional multi-site phosphorylation of HSP27 during late stages of Salmonella infection
could be explained by the host actin remodeling activity of HSP27. Confocal microscopy
provides support for this hypothesis: actin condensation is seen with labeled F-actin in
Salmonella SteC+ infected cells but not in Salmonella SteC- infected cells54.

A SILAC study investigating host proteins involved in Salmonella replication during late
stages of Salmonella infection reported that HSP27 was significantly enriched in a Golgi
fraction. The SCV has previously been observed to localize in the Golgi region during SCV
maturation, but the mechanism of SCV maturation remains unknown73. PCP-SILAC protein profiles
of HSP27 show two peaks, a large broad peak and a small sharp peak in the low MW region,
possibly corresponding to different size oligomers of HSP27. The protein profiles show a
shift towards the lower MW species during Salmonella infection, implying HSP27 disassembly
(Figure 3.9). This provides the first structural evidence of HSP27 being targeted during
early Salmonella infection.


4.2.3 Cyclase Associated Protein-1

Dynamic changes of cell morphology are shaped by the underlying actin

cytoskeleton that forms a complex network within the cytosol. Rapid remodeling of

the actin cytoskeleton is a tightly controlled process regulated by actin binding

proteins that respond to internal and external stimuli. One family of actin binding

proteins that coordinate actin dynamics with cell signaling pathways are the highly

conserved adenylyl cyclase associated proteins (CAP). CAPs help regulate cell

polarity, cell motility, and endocytosis74,75.

In cells, actin is an ATPase that exists either in a monomeric globular state (G-actin) or a
polymeric filamentous state (F-actin). The rate-limiting steps of actin disassembly are the
severing of actin filaments and the recycling of ADP-G-actin to ATP-G-actin. Recent
experiments show that CAP1 is a bifunctional protein that helps catalyze both of these
rate-limiting steps. CAP1 binds to actin in a 1:1 stoichiometry and self-associates to form a
600 kDa hexameric complex76. The C-terminal end of CAP1 recycles ADP-G-actin to ATP-G-actin,
while the N-terminal end enhances cofilin-mediated filament severing rates57-59.

A genome-wide RNAi screen of host proteins affecting SopE-mediated Salmonella invasion
identified CAP1 as one of the top hits that enhanced invasion efficiency77. PCP-SILAC also
identified CAP1 as a protein with dynamic protein complexes during Salmonella invasion.
Further investigation of CAP1 protein complex dynamics will be necessary to understand how
Salmonella manipulates the actin cytoskeleton.


4.2.4 C-t-PAK2

The p21-activated kinases (PAKs) are a family of serine/threonine kinases that

control diverse biological processes including cytoskeleton dynamics and apoptosis.

They are present in the cytoplasm as homodimers in a trans-inhibited conformation

and become activated by external stimuli downstream of small GTPases, RAC and

CDC42. PAKs have a growing list of binding targets, making them a versatile class of

signaling enzymes55.

Pathogens have evolved strategies to rewire host signaling networks by mimicking host
proteins with their effector proteins. Enterohaemorrhagic Escherichia coli (EHEC) and
Salmonella use similar strategies to hijack host machinery and invade the host cell. An EHEC
host-pathogen study by Selyunin provided structural evidence of a novel host-pathogen
complex. E. coli uses a T3SS effector, EspG, as a scaffold to recruit the host signaling
proteins PAK2 and ARF to the Golgi apparatus, a subcellular location not previously
associated with PAK256. Several Salmonella effectors are known to modulate the activity of
CDC42 and RAC by mimicking their regulators. Data from PCP-SILAC identified PAK2 as a dynamic
protein (log2 fold change of -2.52), suggesting that Salmonella also fine-tunes cytoskeleton
dynamics during invasion by rewiring host signaling networks.

4.3 Host translation during Salmonella invasion

Cells are challenged with dynamic environments that require orchestrated

protein turnover in order to adapt to the changing conditions. Protein translation is

an essential process that allows the cell to generate any protein encoded in its

genome, including proteins for defense against invading pathogens. In response,


pathogens have evolved mechanisms to control host translation as a way to suppress

the host’s ability to combat the pathogen and divert more nutrients for intracellular

pathogen growth78.

Experimental and computational studies of Salmonella’s nutritional landscape

revealed that Salmonella can access at least 31 chemically diverse host nutrients for

growth79. Among these host nutrients are amino acids (e.g., arginine, lysine,

threonine, glutamate, and serine), which suggests that by inhibiting host protein

synthesis, Salmonella can gain more nutrients for growth.

Legionella pneumophila is a Gram-negative bacterium that typically resides inside amoebae but
can also inhabit mammalian lung tissue, causing pneumonia and Legionnaires’ disease. Similar
to Salmonella, Legionella pneumophila uses a specialized secretion apparatus to translocate
effector molecules into the host cytosol during invasion. Host-pathogen studies of Legionella
pneumophila have reported a global decrease in host translation as a result of five effectors
that bind and modify host elongation factors. The authors propose that this is a strategy to
suppress the host signaling response80.

4.3.1 Protein synthesis during Salmonella invasion

The host translational machinery was identified by PCP-SILAC experiments to be changing
composition (assembling/disassembling) in response to early Salmonella enterica serovar
Typhimurium infection. Further experiments with a fluorescence-based assay to measure protein
synthesis confirmed that protein translation was globally suppressed during WT Salmonella
infection. The same assay was also used to test mutant Salmonella strains with six effector
proteins knocked out. None of the tested strains showed a statistically significant change in
protein synthesis compared to WT, suggesting that the effectors knocked out in these strains
are not responsible for the host protein translation inhibition phenotype. Further
experiments are needed to determine the mechanism by which Salmonella infection inhibits host
protein translation.

It should be noted that there was high variability in some of the assays. The Click-iT AHA
protein synthesis assay was developed for high content screening (HCS) in 96-well plates,
which allows the assay to be easily scaled up to test thousands of different compound
treatments. This high throughput comes at a cost: working at such small scales makes it
important to control minor technical variation, such as pipetting errors or dust particles
falling into the wells. Variability of this assay could be greatly reduced by using a robotic
system for dispensing fluids (the experiments in this thesis were performed manually with a
multi-channel pipette). Sensitivity of this assay could also be improved by increasing the
concentration of the AHA reagent.

4.3.2 HSP27 inhibits translation

HSP27 was described above because it was one of our top candidates identified by PCP-SILAC as
being dynamic in response to Salmonella infection. Phosphoproteomic studies also identified
this protein as being regulated at both early and late stages of Salmonella infection, and
its function in the context of Salmonella infection has mainly been attributed to regulating
actin dynamics. Another reported role of HSP27 during heat shock is the inhibition of protein
translation by binding to eIF4G and facilitating the dissociation of cap-initiation
complexes81. This multifunctional protein could potentially offer more insight into how
Salmonella targets host translation during infection.


Chapter 5: Conclusion

SEC-PCP-SILAC was applied to a host-pathogen system for the first time to

study dynamic protein interactions of cytosolic protein complexes during early

Salmonella infection. This global analysis was performed in triplicate profiling 4,049

human proteins from HeLa cells, representing 346 distinct protein complexes.

Proteins that were members of protein complexes that were changing in response to

stimuli were identified by three different methods leading to the hypothesis that host

translational machinery was being targeted by Salmonella during infection.

A fluorescent assay was used to measure protein synthesis of cells infected

with WT and effector knockout strains of Salmonella. There was a significant

decrease of host protein translation in cells infected with WT Salmonella versus the

non-infected control (2-tailed T-test, p-value 0.01) supporting our hypothesis that

Salmonella targets host translational machinery. The effector knockout strains tested

did not show a significant difference of protein translation compared to WT. Further

mechanistic studies will be needed to determine the effector(s) responsible for this

host response. This work provides a rich resource of candidate host proteins that may

be involved with Salmonella’s pathogenesis and provides the first snapshot of global

cytosolic protein complexes during Salmonella infection.


Bibliography

1. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell

biology. Nature. 1999;402(6761 Suppl):C47-52.

2. Han JJ, Bertin N, Hao T, et al. Evidence for dynamically organized modularity in

the yeast protein-protein interaction network. Nature. 2004;430(6995):88-93.

3. Fields S, Song O. A novel genetic system to detect protein-protein interactions.

Nature. 1989;340(6230):245-246.

4. Krogan NJ, Cagney G, Yu H, et al. Global landscape of protein complexes in the

yeast saccharomyces cerevisiae. Nature. 2006;440(7084):637-643.

5. Gavin A, Aloy P, Grandi P, et al. Proteome survey reveals modularity of the yeast

cell machinery. Nature. 2006;440(7084):631-636.

6. Smith T. The hog-cholera group of bacteria. U S Bur Anim Ind Bull. 1894;6:6-40.

7. Pui CF, Wong WC, Chai LC, et al. Salmonella: A foodborne pathogen. International

Food Research Journal. 2011;18:465-473.

8. Ramos-Morales F. Impact of salmonella enterica type III secretion system effectors

on the eukaryotic host cell. ISRN Cell Biol. 2012.

9. Miller S, Pegues D. Salmonella species, including Salmonella Typhi. In: Mandell,

Douglas, and Bennett’s principles and practice of infectious diseases. Seventh ed.

Churchill Livingstone; 2009:2636-650.

10. Ohl ME, Miller SI. Salmonella: A model for bacterial pathogenesis. Annu Rev Med.

2001;52(1):259-274.

11. Haraga A, Ohlson MB, Miller SI. Salmonellae interplay with host cells. Nat Rev

Microbiol. 2008;6(1):53-66.

12. Jones BD. Host responses to pathogenic salmonella infection. Genes Dev.

1997;11(6):679-687.

13. Winter SE, Thiennimitr P, Winter MG, et al. Gut inflammation provides a

respiratory electron acceptor for salmonella. Nature. 2010;467(7314):426-429.

14. Thiennimitr P, Winter SE, Baumler AJ. Salmonella, the host and its microbiota.

Curr Opin Microbiol. 2012;15(1):108-114.

15. Porwollik S. Salmonella: From genome to function. Wymondham: Caister

Academic Press; 2011:300, A1.

16. Schlumberger MC, Müller AJ, Ehrbar K, et al. Real-time imaging of type III

secretion: Salmonella SipA injection into host cells. Proceedings of the National

Academy of Sciences of the United States of America. 2005;102(35):12548-12553.

17. Reis R, Horn F. Enteropathogenic escherichia coli, samonella, shigella and

yersinia: Cellular aspects of host-bacteria interactions in enteric diseases. Gut

Pathogens. 2010;2(1):8.

18. Rogers LD, Brown NF, Fang Y, Pelech S, Foster LJ. Phosphoproteomic analysis of

salmonella-infected cells identifies key kinase regulators and SopB-dependent host

phosphorylation events. Science signaling. 2011;4(191):rs9.

19. Wasinger VC, Cordwell SJ, Cerpa-Poljak A, et al. Progress with gene-product

mapping of the mollicutes: Mycoplasma genitalium. Electrophoresis. 1995;16(7):1090-1094.

20. de Hoog CL, Mann M. Proteomics. Annu Rev Genomics Hum Genet. 2004;5:267-293.

21. Canas B, Lopez-Ferrer D, Ramos-Fernandez A, Camafeita E, Calvo E. Mass

spectrometry technologies for proteomics. Brief Funct Genomic Proteomic.

2006;4(4):295-320.

22. Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature.

2003;422(6928):198-207.

23. Schulze WX, Usadel B. Quantitation in mass-spectrometry-based proteomics. Annu

Rev Plant Biol. 2010;61:491-516.

24. Ishihama Y, Schmidt T, Rappsilber J, et al. Protein abundance profiling of the

Escherichia coli cytosol. BMC Genomics. 2008;9:102.

25. Pickup JF, McPherson K. Theoretical considerations in stable isotope dilution mass

spectrometry for organic analysis. Anal Chem. 1976;48(13):1885-1890.

26. Ong SE, Blagoev B, Kratchmarova I, et al. Stable isotope labeling by amino acids in

cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol

Cell Proteomics. 2002;1(5):376-386.

27. Rigaut G, Shevchenko A, Rutz B, Wilm M, Mann M, Seraphin B. A generic protein

purification method for protein complex characterization and proteome exploration.

Nat Biotechnol. 1999;17(10):1030-1032.

28. Gavin AC, Maeda K, Kuhner S. Recent advances in charting protein-protein

interaction: Mass spectrometry-based approaches. Curr Opin Biotechnol.

2011;22(1):42-49.

29. Collins MO, Choudhary JS. Mapping multiprotein complexes by affinity purification

and mass spectrometry. Curr Opin Biotechnol. 2008;19(4):324-330.

30. Dong M, Yang LL, Williams K, et al. A "tagless" strategy for identification of stable

protein complexes genome-wide by multidimensional orthogonal chromatographic

separation and iTRAQ reagent tracking. J Proteome Res. 2008;7(5):1836-1849.

31. Havugimana PC, Hart GT, Nepusz T, et al. A census of human soluble protein

complexes. Cell. 2012;150(5):1068-1081.

32. Heide H, Bleier L, Steger M, et al. Complexome profiling identifies TMEM126B as a

component of the mitochondrial complex I assembly complex. Cell Metab.

2012;16(4):538-549.

33. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring

temporal changes in the interactome. Nat Methods. 2012;9(9):907-909.

34. Kirkwood KJ, Ahmad Y, Larance M, Lamond AI. Characterisation of native protein

complexes and protein isoform variation using size-fractionation based quantitative

proteomics. Mol Cell Proteomics. 2013.

35. Klockenbusch C, Kast J. Optimization of formaldehyde cross-linking for protein

interaction analysis of non-tagged integrin beta1. J Biomed Biotechnol.

2010;2010:927585.

36. Zhang H, Tang X, Munske GR, Tolic N, Anderson GA, Bruce JE. Identification of

protein-protein interactions and topologies in living cells with chemical cross-linking

and mass spectrometry. Mol Cell Proteomics. 2009;8(3):409-420.

37. Ruepp A, Waegele B, Lechner M, et al. CORUM: The comprehensive resource of

mammalian protein complexes--2009. Nucleic Acids Res. 2010;38(Database

issue):D497-501.

38. Yang W, Steen H, Freeman MR. Proteomic approaches to the analysis of

multiprotein signaling complexes. Proteomics. 2008;8(4):832-851.

39. Kühner S, van Noort V, Betts MJ, et al. Proteome organization in a genome-reduced

bacterium. Science. 2009;326(5957):1235-1240.

40. Ong SE, Kratchmarova I, Mann M. Properties of 13C-substituted arginine in stable

isotope labeling by amino acids in cell culture (SILAC). J Proteome Res. 2003;2(2):173-181.

41. Rappsilber J, Mann M, Ishihama Y. Protocol for micro-purification, enrichment,

pre-fractionation and storage of peptides for proteomics using StageTips. Nat

Protocols. 2007;2(8):1896-1906.

42. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized

p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat

Biotechnol. 2008;26(12):1367-1372.

43. Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, Mann M. Andromeda: A

peptide search engine integrated into the MaxQuant environment. J Proteome Res.

2011;10(4):1794-1805.

44. Elias JE, Gygi SP. Target-decoy search strategy for increased confidence in large-scale

protein identifications by mass spectrometry. Nat Meth. 2007;4(3):207-214.

45. Vinayagam A, Hu Y, Kulkarni M, et al. Protein complex-based analysis framework

for high-throughput data sets. Sci Signal. 2013;6(264):rs5.

46. Smyth GK. Linear models and empirical bayes methods for assessing differential

expression in microarray experiments. Stat Appl Genet Mol Biol. 2004;3:Article3.

47. Storey JD, Tibshirani R. Statistical significance for genomewide studies.

Proceedings of the National Academy of Sciences. 2003;100(16):9440-9445.

48. Huang da W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: Paths

toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res.

2009;37(1):1-13.

49. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of

large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44-57.

50. Boyle EC, Brown NF, Finlay BB. Salmonella enterica serovar typhimurium effectors

SopB, SopE, SopE2 and SipA disrupt tight junction structure and function. Cell

Microbiol. 2006;8(12):1946-1957.

51. Zheng Y. Dynamic composition of membrane microdomains. [Doctoral

dissertation]. University of British Columbia; 2012.

52. Aboulaich N, Vainonen JP, Strålfors P, Vener AV. Vectorial proteomics reveal

targeting, phosphorylation and specific fragmentation of polymerase I and transcript

release factor (PTRF) at the surface of caveolae in human adipocytes. Biochem J.

2004;383(2):237-248.

53. Hill MM, Bastiani M, Luetterforst R, et al. PTRF-cavin, a conserved cytoplasmic

protein required for caveola formation and function. Cell. 2008;132(1):113-124.

54. Imami K, Bhavsar AP, Yu H, et al. Global impact of salmonella pathogenicity island

2-secreted effectors on the host phosphoproteome. Mol Cell Proteomics.

2013;12(6):1632-1643.

55. Bokoch GM. Biology of the p21-activated kinases. Annu Rev Biochem.

2003;72(1):743-781.

56. Selyunin AS, Sutton SE, Weigele BA, et al. The assembly of a GTPase-kinase

signalling complex by a bacterial catalytic scaffold. Nature. 2011;469(7328):107-111.

57. Balcer HI, Goodman AL, Rodal AA, et al. Coordinated regulation of actin filament

turnover by a high-molecular-weight Srv2/CAP complex, cofilin, profilin, and Aip1.

Curr Biol. 2003;13(24):2159-2169.

58. Chaudhry F, Breitsprecher D, Little K, Sharov G, Sokolova O, Goode BL.

Srv2/cyclase-associated protein forms hexameric shurikens that directly catalyze

actin filament severing by cofilin. Mol Biol Cell. 2013;24(1):31-41.

59. Brieher W. Mechanisms of actin disassembly. Mol Biol Cell. 2013;24(15):2299-2302.

60. Schleker S, Sun J, Raghavan B, et al. The current salmonella-host interactome.

Proteomics Clin Appl. 2012;6(1-2):117-133.

61. Galbiati F, Razani B, Lisanti MP. Emerging themes in lipid rafts and caveolae. Cell.

2001;106(4):403-411.

62. Matveev S, Li X, Everson W, Smart EJ. The role of caveolae and caveolin in

vesicle-dependent and vesicle-independent trafficking. Adv Drug Deliv Rev.

2001;49(3):237-250.

63. Briand N, Dugail I, Le Lay S. Cavin proteins: New players in the caveolae field.

Biochimie. 2011;93(1):71-77.

64. Hayer A, Stoeber M, Bissig C, Helenius A. Biogenesis of caveolae: Stepwise

assembly of large caveolin and cavin complexes. Traffic. 2010;11(3):361-382.

65. Chadda R, Mayor S. PTRF triggers a cave in. Cell. 2008;132(1):23-24.

66. Manes S, del Real G, Martinez-A C. Pathogens: Raft hijackers. Nat Rev Immunol.

2003;3(7):557-568.

67. Lim JS, Choy HE, Park SC, Han JM, Jang IS, Cho KA. Caveolae-mediated entry of

salmonella typhimurium into senescent nonphagocytotic host cells. Aging Cell.

2010;9(2):243-251.

68. Pelkmans L, Zerial M. Kinase-regulated quantal assemblies and kiss-and-run

recycling of caveolae. Nature. 2005;436(7047):128-133.

69. Aquilina JA, Shrestha S, Morris AM, Ecroyd H. Structural and functional aspects of

hetero-oligomers formed by the small heat shock proteins alphaB-crystallin and

HSP27. J Biol Chem. 2013;288(19):13602-13609.

70. Mymrikov EV, Seit-Nebi AS, Gusev NB. Large potentials of small heat shock

proteins. Physiol Rev. 2011;91(4):1123-1159.

71. Garrido C. Size matters: Of the small HSP27 and its large oligomers. Cell Death

Differ. 2002;9(5):483-485.

72. Mymrikov EV, Seit-Nebi AS, Gusev NB. Heterooligomeric complexes of human small

heat shock proteins. Cell Stress Chaperones. 2012;17(2):157-169.

73. Vogels MW, van Balkom BW, Heck AJ, et al. Quantitative proteomic identification

of host factors involved in the salmonella typhimurium infection cycle. Proteomics.

2011;11(23):4477-4491.

74. Lee S, Dominguez R. Regulation of actin cytoskeleton dynamics in cells. Mol Cells.

2010;29(4):311-325.

75. Hubberstey AV, Mottillo EP. Cyclase-associated proteins: CAPacity for linking

signal transduction and actin polymerization. FASEB J. 2002;16(6):487-499.

76. Quintero-Monzon O, Jonasson EM, Bertling E, et al. Reconstitution and dissection

of the 600-kDa Srv2/CAP complex: Roles for oligomerization and cofilin-actin binding

in driving actin turnover. J Biol Chem. 2009;284(16):10923-10934.

77. Misselwitz B, Dilling S, Vonaesch P, et al. RNAi screen of salmonella invasion

shows role of COPI in membrane targeting of cholesterol and Cdc42. Mol Syst Biol.

2011;7.

78. Mohr I, Sonenberg N. Host translation at the nexus of infection and immunity. Cell

Host & Microbe. 2012;12(4):470-483.

79. Steeb B, Claudi B, Burton NA, et al. Parallel exploitation of diverse host nutrients

enhances Salmonella virulence. PLoS Pathog. 2013;9(4):e1003301.

80. Fontana MF, Banga S, Barry KC, et al. Secreted bacterial effectors that inhibit

host protein synthesis are critical for induction of the innate immune response to

virulent legionella pneumophila. PLoS Pathog. 2011;7(2):e1001289.

81. Cuesta R, Laroia G, Schneider RJ. Chaperone hsp27 inhibits translation during heat

shock by binding eIF4G and facilitating dissociation of cap-initiation complexes. Genes

Dev. 2000;14(12):1460-1470.

82. Szklarczyk D, Franceschini A, Kuhn M, et al. The STRING database in 2011:

Functional interaction networks of proteins, globally integrated and scored. Nucleic

Acids Res. 2011;39(Database issue):D561-8.

Appendices

Appendix A List of Protein Complexes Enriched in PCP-SILAC dataset

Table 5.1 Protein Complex Enrichment Analysis Results

Complex ID Size Type Complex Name Gene IDs

HC1180 3 Literature CAND1-CUL4A-RBX1 complex 8451 9978 55832

HC1459 3 Literature Ubiquitin E3 ligase (DDB1, CUL4A, RBX1) 8451 9978 1642

HC1790 3 Literature EIF3 complex (EIF3B, EIF3J, EIF3I) 8662 8668 8669

HC198 3 Literature RFC core complex 5984 5985 5982

HC2064 3 Literature ubiquitin-dependent protein catabolic process 8451 8450 79016

HC2542 3 Literature methionyl glutamyl tRNA synthetase complex 2058 4141 9255

HC2579 3 Literature p54(nrb)-PSF-matrin3 complex 4841 9782 6421

HC2611 3 Literature HSP90-CDC37-LRRK2 complex 3326 120892 11140

HC3292 3 Literature putative complex without known function 56647 9349 84193

HC4 3 Literature TOP1-PSF-P54 complex 4841 6421 7150

HC465 3 Literature COG5-COG6-COG7 subcomplex 57511 91949 10466

HC609 3 Literature CAND1-CUL4B-RBX1 complex 9978 8450 55832

HC794 3 Literature APC-IQGAP1-Cdc42 complex 324 998 8826

HC92 3 Literature COG2-COG3-COG4 subcomplex 83548 25839 22796

HC1035 4 Literature Tetrameric COG subcomplex 57511 84342 91949 10466

HC1645 4 Literature EIF3 core complex (EIF3A, EIF3B, EIF3G, EIF3I) 8662 8668 8666 8661

HC2090 4 Literature CDC37-HSP90AA1-HSP90AB1-MAP3K11 complex 3320 3326 4296 11140

HC2237 4 Literature Cul4B-RING ubiquitin ligase complex 51514 9978 1642 8450

HC2245 4 Literature glycogen catabolic process 5836 64210 5837 5834

HC2575 4 Literature actin filament capping 830 832 829 93661


HC3024 4 Literature protein K48-linked ubiquitination 8454 997 54926 8453

HC3108 4 Literature Ubiquitin E3 ligase (CDT1, DDB1, CUL4A, RBX1) 8451 9978 1642 81620

HC5426 4 Predicted SRP-dependent cotranslational protein targeting to membrane 682 51027 6155 6129

HC763 4 Literature 9S-cytosolic aryl hydrocarbon (Ah) receptor non-ligand activated complex 3320 196 3326 9049

HC9556 4 Predicted COPI coating of Golgi vesicle 22908 375 9276 11316

HC9836 4 Predicted putative complex without known function 56647 832 50618 5764

HC1088 5 Literature ribonucleoside monophosphate biosynthetic process 5636 5635 5631 5634 221823

HC1542 5 Literature eIF2B 8892 8893 8890 8891 1967

HC1815 5 Literature EIF3 complex (EIF3A, EIF3B, EIF3G, EIF3I, EIF3C) 8662 8663 8668 8666 8661

HC2450 5 Literature COG1-COG8-COG5-COG6-COG7 subcomplex 9382 57511 84342 91949 10466

HC2673 5 Literature tRNA-splicing ligase complex 51637 79074 283742 1653 51493

HC2832 5 Literature PCNA-RFC2-5 complex 5984 5111 5983 5985 5982

HC2851 5 Literature Ubiquitin E3 ligase (DDB1, DDB2, CUL4A, CUL4B, RBX1) 1643 8451 9978 1642 8450

HC2930 5 Literature EIF3 complex (EIF3A, EIF3B, EIF3G, EIF3I, EIF3J) 8662 8668 8666 8669 8661

HC3056 5 Literature COPI coating of Golgi vesicle 200205 55510 1314 9276 168400

HC3061 5 Literature Ubiquitin E3 ligase (DET1, DDB1, CUL4A, RBX1, COP1) 64326 8451 55070 9978 1642

HC3813 5 Predicted putative complex without known function 10484 7534 10539 57019 65220

HC49 5 Literature putative complex without known function 25940 147965 51637 283742 1653

HC5736 5 Predicted nucleotide-binding domain, leucine rich repeat containing receptor signaling pathway 1457 10808 5970 3326 7316

HC5962 5 Predicted M/G1 transition of mitotic cell cycle 7431 60 310 7316 6117

HC6954 5 Predicted RNA processing 1655 3187 56916 6613 10521

HC7 5 Literature COP9 signalosome complex (CSN) 10920 9318 2873 2671 10987

HC7079 5 Predicted translational elongation 10368 1937 10598 1891 1936



HC8050 5 Predicted G2/M transition of mitotic cell cycle 9857 1780 10121 10540 55860

HC900 5 Literature COG1-COG8-COG2-COG3-COG4 subcomplex 9382 83548 25839 22796 84342


HC99 5 Literature RFC complex (activator A 1 complex) 5984 5981 5983 5985 5982

HC101 6 Literature PCNA-CHL12-RFC2-5 complex 5984 5111 5983 5985 5982 63922

HC1573 6 Literature dynactin complex 10671 11258 10120 10121 10540 55860

HC1860 6 Literature BRD4-RFC complex 5984 5981 5983 23476 5985 5982


HC2037 6 Literature alpha DNA polymerase:primase complex 23649 5558 5422 92797 4172 5557

HC2479 6 Literature eukaryotic translation initiation factor 2B complex 1965 8892 8893 8890 8891 1967

HC305 6 Literature cullin deneddylation 64708 9318 10980 50813 51138 8533

HC3194 6 Literature DNA replication factor C complex 5984 5111 5981 5983 5985 5982

HC3233 6 Literature putative complex without known function 90861 10539 654483 57019 51155 552900

HC3486 6 Literature Ubiquitin E3 ligase (AHR, ARNT, DDB1, TBL3, CUL4B, RBX1) 9978 196 405 10607 1642 8450

HC3991 6 Predicted termination of G-protein coupled receptor signaling pathway 10971 2869 7531 6011 801 5957

HC4021 6 Predicted RNA splicing 22938 22803 23435 9768 5394 988

HC4171 6 Predicted protein targeting 2885 7534 324 6711 7316 7532

HC4303 6 Predicted nucleobase-containing compound metabolic process 56647 1806 9349 84193 5873 7316

HC4332 6 Predicted nuclear-transcribed mRNA catabolic process, nonsense-mediated decay 56647 9349 84193 7316 84305 6194

HC4376 6 Predicted oligodendrocyte development 8892 1965 8894 8890 8891 55572

HC5265 6 Predicted RNA biosynthetic process 2778 6154 409 9584 988 6207

HC5507 6 Predicted nucleobase-containing compound metabolic process 56647 51552 1806 9349 84193 7316

HC569 6 Literature dynactin complex 10120 9857 10121 10540 25999 55860

HC5950 6 Predicted RNA metabolic process 3178 1655 56252 7916 202559 10521

HC6043 6 Predicted putative complex without known function 56647 1806 9349 326624 84193 7316


HC7447 6 Predicted endosome transport 4733 790 51389 1819 55681 7316

HC7559 6 Predicted nuclear mRNA splicing, via spliceosome 3190 3178 140890 7001 8761 6421

HC764 6 Literature MCM complex 4173 4175 4171 4176 4174 4172

HC7809 6 Predicted ribonucleoprotein complex assembly 11218 85015 158747 1653 23603 7316

HC7861 6 Predicted nuclear mRNA splicing, via spliceosome 3190 3178 140890 6428 26986 8761


HC829 6 Literature BASC (Ab 81) complex (BRCA1-associated genome surveillance complex) 472 5984 672 4292 5981 5982

HC8495 6 Predicted ribonucleoside monophosphate biosynthetic process 23065 51643 5635 5631 7316 5634

HC9199 6 Predicted axon guidance 10963 3320 10728 3326 9049 7316

HC9301 6 Predicted positive regulation of proteasomal ubiquitin-dependent protein catabolic process 10210 4841 6613 6421 7341 7150

HC9364 6 Predicted translational elongation 1460 10728 1937 1917 1936 1933

HC9503 6 Predicted RNA processing 3178 51253 56252 8106 10521 3276

HC1260 7 Literature PALS1-Par3-aPKC-14-3-3 zeta complex 7534 64398 7531 117583 7529 56288 5584

HC1602 7 Literature Mcm2-7 4173 4175 4171 4176 4174 254394 4172


HC1605 7 Literature SCF-CDC4 complex 6500 8454 10910 55294 997 54926 8453

HC1870 7 Literature translational elongation 81570 10985 1937 1915 1917 1936 1933

HC1887 7 Literature positive regulation of cell cycle arrest 5527 5529 5526 5525 5528 51629 55972

HC2053 7 Literature Coatomer complex 22818 1314 372 1315 9276 11316 22820

HC2161 7 Literature transcription-coupled nucleotide-excision repair 5984 5981 5983 79915 5985 5982 63922

HC2549 7 Literature Pol epsilon 1655 55510 79009 5426 1662 10521 168400

HC2722 7 Literature DNA unwinding involved in replication 4173 4175 4171 4176 79892 4174 4172

HC323 7 Literature emerin C24 4173 79595 4175 2010 3192 4171 708

HC380 7 Literature Cul4A-RING ubiquitin ligase complex 51514 1161 8451 9978 1642 26133 51185

HC4104 7 Predicted mRNA metabolic process 6193 56647 1806 9349 6125 84193 7316

HC5431 7 Predicted actin filament capping 60 5521 832 11344 4703 829 4131

HC5467 7 Predicted RNA processing 3181 3178 1655 56252 202559 10521 7150

HC5564 7 Predicted COPI coating of Golgi vesicle 22938 1314 372 7316 1315 9276 988

HC5782 7 Predicted positive regulation of transcription from RNA polymerase II promoter 2033 1387 1655 8202 8648 10499 10521

HC5832 7 Predicted cellular amino acid catabolic process 4733 790 501 51389 1819 158078 7316

HC596 7 Literature RC complex 5984 5981 5558 5985 5982 5422 5557

HC6198 7 Predicted regulation of Ras protein signal transduction 9411 998 2286 5911 7316 10564 10565



HC7564 7 Predicted nucleotide biosynthetic process 5636 5598 5635 5631 5723 7316 5634

HC7577 7 Predicted ubiquitin-dependent protein catabolic process 4738 8451 10980 9978 10987 8450 1642

HC7626 7 Predicted nucleobase-containing compound metabolic process 56647 9945 1806 9349 84193 7316 55505

HC7934 7 Predicted neuron development 2932 10413 1500 999 1499 5663 1495

HC8455 7 Predicted nerve growth factor receptor signaling pathway 7534 51727 4303 7249 9759 7531 4140


HC9355 7 Predicted regulation of protein serine/threonine kinase activity 10963 3320 6885 5536 3326 7316 11140

HC9403 7 Predicted hippocampus development 7534 7248 2308 55711 9759 7531 4140

HC1131 8 Literature CCT complex (chaperonin containing TCP1 complex) 7203 908 22948 10576 10574 6950 10694 10575

HC2544 8 Literature Coatomer-Arf1 complex 22818 1314 11316 375 372 1315 9276 22820

HC4343 8 Predicted protein ubiquitination 51514 8451 9978 79016 8450 80344 1642 55832

HC4525 8 Predicted tRNA aminoacylation for protein translation 2058 3735 55613 5859 9255 7965 66036 1506


HC494 8 Literature Conserved oligomeric Golgi (COG) complex 9382 83548 25839 57511 22796 84342 91949 10466

HC5054 8 Predicted cullin deneddylation 8454 10920 10980 51138 9318 10987 8533 7316



HC7137 8 Predicted ubiquitin-dependent protein catabolic process 8451 8882 9978 8450 3312 80344 1642 55832

HC8321 8 Predicted transmembrane receptor protein tyrosine kinase signaling pathway 3667 8503 7534 2316 3480 7531 7532 9846

HC1356 9 Literature BRAF-RAF1-14-3-3 complex 673 5894 10971 7534 7531 7533 2810 7532 7529

HC1360 9 Literature regulation of translational initiation in response to stress 8893 8894 1968 8872 1967 1965 8892 8890 8891

HC1442 9 Literature tRNA aminoacylation for protein translation 2058 5917 3376 5859 1615 4141 9255 7965 9521

HC2495 9 Literature CCT complex (chaperonin containing TCP1 complex), testis specific 7203 908 22948 10576 10574 6950 10693 10694 10575

HC3100 9 Literature COP9 signalosome complex 10920 9318 50813 51138 8533 10987 64708 10980 2873

HC4835 9 Predicted cullin deneddylation 51138 9318 8533 2516 7534 64708 10980 2873 1642


HC7099 9 Predicted insulin receptor signaling pathway 8503 3667 7534 7531 2308 3480 7249 7532 9846

HC8974 9 Predicted tRNA aminoacylation for protein translation 5917 4141 2058 3735 9255 3376 7965 5859 1615


HC2308 10 Literature chaperonin-containing T-complex 22948 10576 6950 908 7203 10574 10693 150160 10694 10575

HC5632 10 Predicted protein ubiquitination 26043 51514 8451 9978 79016 8450 80344 8452 1642 55832

HC5640 10 Predicted protein ubiquitination 26043 8451 6923 9978 7428 8453 8450 122769 1642 55832

HC7047 10 Predicted cullin deneddylation 51138 9318 9978 8533 8453 8452 8454 64708 10980 2873

HC91 10 Literature L2DTL 51514 3308 5111 9318 51138 8533 10987 10980 2873 1642

HC975 10 Literature Golgi transport complex 23256 25839 57511 22796 84342 91949 2802 9382 83548 10466

HC9806 10 Predicted protein targeting 3799 3831 10971 7534 7531 7533 7532 23367 7316 7529

HC1026 11 Literature eIF3 8663 8668 23277 6294 8669 728689 8662 9667 8666 79811 8661

HC1649 11 Literature Ksr1 complex (Ksr1, Mek, 14-3-3), unstimulated 8844 5604 283455 10971 7534 7531 7533 5605 7532 7529 2810

HC2402 11 Literature COPI 22818 51226 1314 11316 26286 372 26958 22820 9276 1315 84364

HC2446 11 Literature B-Ksr1-MEK-MAPK-14-3-3 complex 8844 5604 283455 10971 7534 7531 5594 7533 5605 7532 7529

HC2675 11 Literature Arp2/3 protein complex 81873 653857 10095 10092 10096 10093 10097 10109 10552 57180 10094


HC4267 11 Predicted cullin deneddylation 51138 9318 9978 8533 8453 8450 8454 64708 10980 2873 1642

HC5608 11 Predicted transcription-coupled nucleotide-excision repair 9125 5984 23476 5111 5983 5985 142 2547 5981 5982 988

HC7230 11 Predicted COPI coating of Golgi vesicle 22938 1314 10972 11316 54732 372 9276 22820 1315 7316 988

HC953 11 Literature Multisynthetase complex 2058 3735 5917 51520 3376 5859 1615 4141 9255 7965 9521

HC3651 12 Predicted tRNA aminoacylation for protein translation 5917 2058 3735 10492 3376 51528 5859 1615 57520 4141 9255 7965

HC5280 12 Predicted tRNA aminoacylation for protein translation 6723 5917 2058 3735 3376 5859 1615 4141 9255 832 7965 124944

HC6407 12 Predicted DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest 22938 5708 5707 5719 5706 51377 5701 5705 5700 7316 988 5704

HC7815 12 Predicted tRNA aminoacylation for protein translation 5917 2058 3735 4800 3376 5859 1615 153443 4141 9255 7965 708

HC3121 13 Literature CSA complex 10920 1161 8451 9318 50813 51138 9978 8533 10987 64708 10980 2873 1642

HC3598 13 Literature DDB2 complex 10920 1643 8451 9318 50813 51138 9978 8533 10987 64708 10980 2873 1642

HC3926 13 Predicted SRP-dependent cotranslational protein targeting to membrane 6130 5036 6154 79877 6139 645683 6155 6132 6129 6142 6152 6135 25873




HC2582 14 Literature CSA-POLIIa complex 10920 1161 8451 9318 50813 51138 9978 8533 10987 64708 10980 2873 5430 1642

HC3679 14 Predicted ubiquitin-dependent protein catabolic process 8451 51138 9318 9978 8533 253832 8453 8450 8452 8454 64708 10980 1642 55832

HC3985 14 Predicted SRP-dependent cotranslational protein targeting to membrane 6137 51042 5036 6154 11224 6139 6208 6155 6132 166378 6152 6124 6135 25873



HC4331 14 Predicted translational initiation 1981 8668 8663 5313 9960 8669 8665 8662 10289 8666 8667 8661 7316 4189



5707 5719 6477 5706 51377 9908 5701 10213 5713 5705 9861 83940 5700 5704


5708 5707 5719 55005 5706 51377 5702 5713 5705 9861 5715 5700 31 5704








HC8029 14 Predicted cullin deneddylation 10920 8451 8065 50813 51138 9318 56254 10987 8533 8454 10980 2873 7316 1642

HC8600 14 Predicted translational initiation 6059 27335 1981 51386 8668 8669 8664 1974 10480 8665 8662 10289 8666 8667








5708 9868 5707 5684 5719 4677 5706 5709 64747 5702 5701 10999 5705 5700

HC3669 15 Predicted ubiquitin-dependent protein catabolic process 4738 26043 8065 8451 9978 10987 8453 8450 54165 8452 8454 10980 9040 27231 1642


5708 5707 5719 5706 51377 5688 10213 5713 5682 5705 9861 55768 5700 7415 5704


5707 5719 5706 51377 5709 10213 5713 26003 5717 5705 9861 5700 5718 5704 6189

HC3789 15 Predicted SRP-dependent cotranslational protein targeting to membrane 6159 54606 6154 6181 6161 6201 6139 6160 6144 6128 6173 6142 6157 25873 6147


56893 6184 5707 79152 5719 7979 5701 5713 6782 5710 5705 29978 5700 5886 11047


6301 5707 5719 5706 471 7979 10213 5713 5716 9861 9097 5710 9987 5700 7316


6517 5707 5719 5706 5714 5701 5713 5716 9861 9097 5705 80227 5700 5704 60681




6184 5707 5719 5706 51377 5701 10213 5713 6185 776 5705 9861 5700 5711 5704




HC4027 15 Predicted translational initiation 1981 8668 8663 9960 8669 8665 8662 1983 4942 10289 8666 8667 8661 7316 388


HC4345 15 Predicted regulation of cellular protein metabolic process 1981 8668 8663 5313 9960 8669 8665 1983 8662 1982 10289 8666 8667 8661 7316


5708 5707 5719 5706 51377 7879 5701 2 10213 5682 5705 9861 5700 5886 5704


8916 5707 5719 5701 5713 6782 55008 5710 5705 29978 5700 7316 29979 5886 11047


5708 5707 5719 5706 51377 5709 5702 10213 5713 5717 5705 9861 5700 5704 5718


5743 5707 5719 5706 5887 8647 5701 5713 5716 10299 9861 5705 5700 7316 5704


5707 5719 5706 55679 7979 5701 5713 55795 9861 3611 9097 5705 80227 5700 5718


5708 55147 5707 5719 5706 5701 5713 5716 9861 9097 5705 80227 5700 5704 7277


5707 5719 5706 471 7979 5714 5701 10213 5713 5716 9861 9097 5710 8644 5700


5707 5719 5706 51377 5709 5702 10213 5713 5717 5705 9861 5715 5700 5718 5704

HC4633 15 Predicted regulation of translational initiation 27335 1981 51386 8663 8668 8669 8664 1974 3646 8665 8662 10289 8666 8661 8667


HC4788 15 Predicted ubiquitin-dependent protein catabolic process 5707 51138 9318 5719 5706 6138 9097 10980 27063 5705 2873 5700 7316 5704 11047



5708 5707 5719 5706 7979 5701 5713 5716 5717 9861 5705 80227 5700 5704 134510



5708 5707 5719 5706 5701 10213 5713 10477 5705 9861 115992 5700 51366 1347 5704


55163 56893 5707 5719 5701 5713 6782 5710 5705 29978 5700 7316 5886 22983 11047



112858 5707 5719 5706 5701 5713 5716 4869 5717 9861 9097 5705 80227 5700 5704





56893 5707 5719 5706 23592 5701 5713 6782 5710 5705 29978 5700 7316 5886 11047


5707 5719 5706 7979 5702 5701 5713 5716 9861 9097 5705 5715 80227 5700 5704


5707 5719 5211 5706 55210 5716 10980 9861 9097 5705 80227 5700 7316 5704 55660


5708 5707 5719 5706 51377 5701 10213 5713 5682 5717 5705 9861 548593 5700 5704


5707 5719 5706 7979 5701 5713 5716 55795 9861 9097 5705 80227 5700 5704 5718


5707 5719 5706 5701 5713 5705 9097 5710 29978 5700 219988 7316 3304 5886 11047


5707 5719 5706 7979 5701 5713 9097 9861 5710 5705 80227 5700 26287 5704 11047


5708 5707 5719 5706 7979 5701 5713 5716 9861 9097 5710 5705 8644 2752 5700


5707 5719 5706 7979 5714 5701 5713 5716 9861 9097 5705 80227 5700 54973 29901


5707 5719 5706 7979 5887 1622 5701 5713 158 5710 7314 5705 5700 7316 5886


5707 5719 5706 7979 5701 5713 7805 5716 9690 9861 9097 5705 5700 7316 5704

HC5637 15 Predicted regulation of translation 1981 8668 8663 8669 1974 8672 1973 8665 8662 1977 1982 26986 8666 8661 8667


5707 51138 9318 5719 5706 5688 5701 81570 5716 5682 10980 9097 2873 5700 5704




159 5708 5707 5719 5706 60678 5701 5713 5716 9861 9097 5710 5700 7316 5704


56893 5707 5719 5706 7979 5701 5713 152559 6782 5710 5705 29978 5700 5886 11047


56893 5707 23549 5719 5701 5713 54495 6782 5710 5705 29978 5700 7316 5886 11047


5707 5719 5706 7979 5701 5713 5716 9861 9097 5705 375449 80227 5700 5704 7277



5708 5707 5719 5706 83858 5713 5716 9861 9097 5710 8565 5700 7316 5704 11047

HC5992 15 Predicted ATP catabolic process 5707 5719 5706 59345 5709 5701 5713 5716 10190 9861 9097 5705 80227 5700 5704



6224 5707 92840 5719 5706 5701 10213 5713 5716 9861 9097 5710 5700 7316 11047


5708 5707 5719 5706 6908 51377 5701 10213 5713 5682 9330 5705 9861 5700 5704


5708 5147 5707 5719 5706 5701 5713 5716 9861 9097 5705 84952 80227 5700 5704


5707 5719 5706 51377 5709 5887 10213 5713 5717 3838 5705 9861 5700 5718 5704


56893 5707 5719 5701 5713 6782 29978 5705 5710 284273 5700 7316 5886 11047 51132


5708 5707 5719 6477 5706 51377 5701 10213 5713 83940 5705 9861 23170 5700 5704




5708 5719 5706 5713 5716 6472 9097 9861 5710 5705 10289 5700 8667 7316 5704


3692 5707 5684 5692 5683 5688 10213 5695 5682 26137 3275 5700 5685 5686 5691


56893 5707 5719 23409 5701 5713 6782 5710 5705 29978 5700 7316 293 5886 11047


5708 5707 5719 5706 51377 5709 5701 10213 5713 5682 5717 5705 9861 5700 5704


5707 5719 5706 7979 5709 5701 5713 5716 9861 9097 5705 2783 80227 5700 5704



55147 5707 5719 5706 7979 5701 5713 5716 9861 9097 5705 80227 5700 5704 6117

HC6401 15 Predicted regulation of cellular protein metabolic process 1981 8668 8663 9960 8669 6472 8665 1983 8662 10289 8666 8667 8661 7316 4189


5708 6397 5707 5719 5706 51377 55207 10213 5713 5682 5705 9861 5700 5686 5704


5708 2058 5707 118980 5719 5706 5701 5713 5716 9861 9097 5710 5700 7316 5704


5708 5707 5719 5706 5713 5716 353 9861 9097 5710 5705 9114 5700 5704 11047

HC6578 15 Predicted protein targeting 9156 5509 60598 51305 5298 10971 7534 7531 3777 7533 10298 5501 7529 2810 7532



5707 51138 9318 5719 5706 7979 5701 81570 5716 10980 9097 7347 5705 2873 5700


5708 5707 5719 5706 3861 5701 6613 5713 6612 5716 9861 9097 80227 5700 5704



51520 5708 5707 5719 5706 83858 10213 5713 5716 9861 9097 5710 5700 7316 11047



5708 5707 5719 5706 51377 5701 7157 10213 5713 29117 5717 5705 9861 5700 5704


5708 5707 5719 5706 5701 5713 5716 9861 9097 5710 5705 80227 5700 5704 11047


5423 5707 5719 5706 7979 5714 5701 7374 5713 5716 9861 9097 80227 5700 11047


5707 5719 5706 471 5701 5713 5716 9861 9097 4200 5710 5705 5700 7316 5704


2730 5707 5719 5706 5701 10213 5713 55871 5716 9861 9097 5705 10289 80227 5700



56893 5479 5707 5719 7979 23350 5701 5713 6782 5710 5705 29978 5700 5886 11047


221496 5707 5719 5706 5701 5713 6782 5710 9097 5705 29978 5700 7316 5886 11047




5707 5719 5706 414301 7979 5714 5701 10213 5713 5716 9861 9097 80227 5700 11047


5708 5707 5719 5706 51377 706 5701 10213 5713 5705 9861 1155 5700 5711 5704


5707 5719 5706 7979 5701 5713 11066 9861 3611 9097 5705 5700 1642 5704 5718




5707 5719 5706 4286 5701 10213 5713 5716 9861 9097 3101 80227 5700 7316 5718


5708 5707 5719 5706 51377 5701 5713 5716 9861 5710 9097 5705 5700 7316 5704


10681 5719 5706 4597 5702 5701 5713 5716 9861 9097 5710 5705 5700 7316 5704


5708 5692 5719 5706 5683 5688 3009 5695 5705 9861 79888 5700 1347 5686 5704





5707 5719 5706 5701 5713 5716 5717 79751 9861 9097 5705 80227 5700 7316 5704


3305 56893 5707 5719 7979 23259 5701 5713 6782 5710 5705 29978 5700 5886 11047


3308 5707 5719 5706 57533 5709 5701 5713 5716 9861 9097 5705 80227 5700 5704

HC7882 15 Predicted posttranscriptional regulation of gene expression 1981 8668 8663 9960 8669 8665 6949 1983 8662 10289 8666 8667 57599 8661 1459


5707 5719 5706 5701 10213 5713 5716 9861 9097 5705 80227 5700 7316 5704 11047


94081 5708 5707 5719 5706 79876 5701 5713 5716 9861 9097 5705 80227 5700 5704


5708 5707 5719 5518 5706 51377 26173 5701 10213 5713 5705 9861 5700 8717 5704




5707 79228 5719 5706 3419 7979 5701 5713 5716 9861 9097 5710 5705 5700 5704



56893 5707 5719 5706 7979 5701 5713 5716 9861 5710 5705 80227 5700 5704 5886



5707 5719 5706 7979 5714 5701 5713 5716 5717 9861 9097 5705 80227 5700 5704



5708 5707 5719 5706 7979 5701 6613 5713 5716 10056 9861 9097 5710 5705 5700


5708 204 5707 5719 5706 5701 5713 5716 9861 9097 5710 11212 5700 7316 5704


5707 5684 23037 5719 5706 51377 5709 226 10213 5713 5717 5705 9861 5700 5704


5707 5719 5706 5701 5713 5716 9097 9861 5710 5705 29978 5700 5704 5886 11047


5708 5707 5719 5706 51377 5701 10213 5713 79029 5682 166378 5705 9861 5700 5704




5708 5707 5719 5706 5701 5713 5716 79751 9861 1080 9097 80227 5700 7316 5704





1013 5707 5719 1410 5706 51377 10213 5713 5717 506 5705 9861 5710 5700 5704


5708 5707 5719 5706 51377 5701 10213 5713 5682 10999 5705 9861 8678 5700 5704


1591 5708 5707 5719 5706 51377 5709 5701 10213 5713 9330 5705 9861 5700 5704



5708 5707 5719 5706 51377 5887 5701 10213 5713 5682 5717 5705 9861 5700 5704


5708 5707 5719 5706 5701 5713 9520 5716 9861 9097 5705 80227 5700 7316 5704


5707 5719 2806 5706 471 7979 5714 5701 10213 5713 5716 9861 9097 5710 5700


22938 5708 5707 5719 5706 51377 5702 10213 5713 5717 166378 5705 9861 5700 5704


5708 5707 10555 5719 5706 5701 5713 5716 9861 9097 5710 80227 5700 7316 5704


5708 5707 5719 5706 51377 5709 5701 10213 5713 5717 5705 9861 5700 5718 5704


5708 3376 5719 5706 5702 5713 5716 7534 5710 9861 9097 5705 10289 5700 5704



56893 5707 5719 8913 7979 5701 5713 6782 5710 5705 29978 5700 11052 5886 11047



128 5707 5719 55005 5706 51377 5702 7157 10213 5713 5717 5705 9861 5700 5704


5707 5719 5706 3419 3421 5701 5713 5716 9861 9097 5710 5705 80227 5700 5704


5707 5719 5706 3326 5701 5713 29978 5705 5710 5700 7316 27145 5704 5886 11047




56893 5707 5719 51499 5701 5713 27166 6782 5710 5705 29978 5700 7316 5886 11047


5707 5719 5706 23503 51377 5709 10213 5713 5717 5705 9861 5700 988 5704 5718



56893 5707 5719 5706 5701 5713 29978 5705 5710 2882 5700 7316 5704 5886 11047





1069 5707 5719 5706 7979 5887 5701 5713 5716 9861 9097 5705 80227 5700 5704



56893 6184 5707 79152 5719 5706 5701 5713 5710 5705 29978 5700 5704 5886 11047


5708 5707 5719 5706 5702 5701 7157 10213 5682 5717 7486 5705 5700 5711 5704


5708 5707 5719 5706 7979 5701 5713 5716 9690 9861 9097 5705 80227 5700 5704


8803 5707 8802 5719 5706 5701 10213 5713 5716 9861 9097 80227 5700 7316 5704


56893 5707 5719 7979 5701 51661 5713 6782 5710 5705 29978 5700 5886 1760 11047


5708 5707 5719 5706 5713 5716 6232 9861 5705 80227 5700 7316 11047 134510 5704


5707 5719 5706 5701 5713 29978 9097 5710 5705 55008 5700 7316 5704 5886 11047


5707 7317 5719 5706 51377 5709 10213 5713 5717 776 9097 5705 9861 5700 5704



5708 5707 5719 5706 51377 5887 5701 10213 5713 5710 9861 55768 5700 7415 5704




5708 5707 5719 5706 5713 5716 6232 9861 23148 5710 5705 5700 7316 11047 134510


5708 5707 5719 5706 51377 5701 10213 5713 5716 5710 9861 9097 5705 5700 7316 5704


5702 5713 5716 5710 7316 5704 5708 5707 5719 5706 51377 5701 10213 9861 9097 5705 5700 5711

HC1404 20 Literature PA700 complex 5702 5713 5716 5710 5715 5704 5708 5707 5719 5706 5709 5714 5701 10213 5717 5705 9861 5700 5711 5718

HC2273 31 Literature 40S ribosomal subunit, cytoplasmic 6202 6224 6209 6187 6230 6235 6204 6193 6229 6203 3921 6189 6218 6191 6234 6217 6210 6201 6233 6208 6205 6194 6223 6227 6206 2197 6188 6222 6231 6228 6207

Appendix B List of Dynamic Proteins in PCP-SILAC Salmonella Dataset Identified by Z-test

Table 5.2 Identification of dynamic proteins after Salmonella infection (Z-Test)

Gene name IPI Protein name Gene name IPI Protein name


PTRF IPI00176903 Cavin -1, Polymerase 1 and transcript release factor PAIRBP1 IPI00410693 PAI1 RNA-binding protein

HSPA5 IPI00003362 Heat Shock 70kDa Protein 5(Glucose-regulated protein, 78kDa) PSMC4 IPI00020042 26S protease regulatory subunit 6B

IMPDH2 IPI00291510 Inosine 5'-monophosphate dehydrogenase 2 MPD IPI00022745 Mevalonate decarboxylase

PDIA6 IPI00299571 Protein disulfide isomerase P5 PSMC5 IPI00023919 26S protease regulatory subunit 8

MCM3 IPI00013214 Minichromosome maintenance complex component 3 PSMD14 IPI00024821 26S proteasome non-ATPase regulatory subunit 14

CCT5 IPI00010720 Chaperonin containing TCP1, subunit 5 eIF3S10 IPI00029012 Eukaryotic translation initiation factor 3 subunit 10

CCT6 IPI00027626 Chaperonin containing TCP1, subunit 6a HIP IPI00032826 Hsc70-interacting protein

CDC46 IPI00018350 CDC46 homolog ADRM1 IPI00033030 110 kDa cell membrane glycoprotein

CDC47 IPI00299904 CDC47 homolog eIF3S12 IPI00033143 Eukaryotic translation initiation factor 3 subunit 12


CCT4 IPI00302927 Chaperonin containing TCP1, subunit 4 eIF3M IPI00102069 Eukaryotic translation initiation factor 3 subunit M

LAMBR IPI00413108 37 kDa laminin receptor precursor GRIPAP1 IPI00873904 GRIP1-associated protein 1

CCT7 IPI00018465 Chaperonin containing TCP1, subunit 7 PSMD12 IPI00185374 26S proteasome non-ATPase regulatory subunit 12

ACTB IPI00021440 Actin, cytoplasmic 2 FKBP4 IPI00219005 51 kDa FK506-binding protein

MCM6 IPI00031517 DNA-replication licensing factor MCM6 RPL22 IPI00219153 60S ribosomal protein L22

BM28 IPI00184330 DNA-replication licensing factor MCM2 RPL6 IPI00790342 60S ribosomal protein L6

CCT3 IPI00553185 Chaperonin containing TCP1, subunit 3 SMC1 IPI00291939 Structural maintenance of chromosomes protein 1A

DNAJ2 IPI00012535 DnaJ homolog subfamily A member IWS1 IPI00296432 IWS1-like protein

HBP IPI00022228 High density lipoprotein-binding protein PSMD1 IPI00299608




GRP94 IPI00027230 94 kDa glucose-regulated protein RBAP46 IPI00395865 Histone acetyltransferase type B subunit 2

SQSTM1 IPI00179473 Sequestosome 1 MATR3 IPI00789551 Matrin-3

HNRNPA1 IPI00215965 Heterogeneous nuclear ribonucleoprotein A1 HSPC117 IPI00550689 tRNA-splicing ligase RtcB homolog

CCT1 IPI00290566 Chaperonin containing TCP1, subunit 1 LAP3 IPI00419237 Cytosol aminopeptidase

CCT2 IPI00297779 Chaperonin containing TCP1, subunit 2 EIF3EIP IPI00465233 Eukaryotic initiation factor 3

HNRNPH1 IPI00479191 Heterogeneous nuclear ribonucleoprotein PSMD13 IPI00549672


CCT8 IPI00784090 Chaperonin containing TCP1, subunit 8 EIF3F IPI00654777 highly similar to Eukaryotic translation initiation factor 3 subunit 5

GRP170 IPI00000877 170 kDa glucose-regulated protein IGF2BP3 IPI00658000 IGF-II mRNA-binding protein 3

ERP57 IPI00025252 Disulfide isomerase ER-60 EIF3B IPI00719752 Eukaryotic translation initiation factor 3 subunit 9

AUF1 IPI00028888 AU-rich element RNA-binding protein TPR IPI00742682 TPR protein

SND1 IPI00140420 100 kDa coactivator UBE2O IPI00783378 Ubiquitin carrier protein O

RPS19 IPI00215780 40S ribosomal protein S19 HEXC IPI00477231 Beta-hexosaminidase

RPS9 IPI00221088 40S ribosomal protein S9 SPTA2 IPI00844215 Alpha-II spectrin

HSPA1 IPI00304925 Heat shock 70 kDa protein 1/2 DDX9 IPI00844578 ATP-dependent RNA helicase A

ABBP1 IPI00334587 APOBEC1-binding protein 1 HK1 IPI00903226 Hexokinase type I

ACAC IPI00396015 ACC-alpha EF1G IPI00937615 Eukaryotic translation elongation factor 1 gamma

KAP1 IPI00438229 KRAB-associated protein 1 PSMB6 IPI00000811 Macropain delta chain

MYL6 IPI00796366 highly similar to Myosin light polypeptide 6 KYNU IPI00003818 Kynureninase

KPNA2 IPI00002214 Importin subunit alpha-2 FLN IPI00333541 Actin-binding protein 280

HNRNPR IPI00011937 Heterogeneous nuclear ribonucleoprotein R PFD6 IPI00005657 Prefoldin subunit 6

RPS18 IPI00013296 40S ribosomal protein S18 RPS2 IPI00013485 40S ribosomal protein S2 PSF IPI00010740 100 kDa DNA-pairing protein DDX5 IPI00017617 DEAD box protein 5 TPM4 IPI00010779 Tropomyosin alpha-4 chain

RPS14 IPI00026271 40S ribosomal protein S14 PDXDC1 IPI00384689 Pyridoxal-dependent decarboxylase domain-containing protein 1



KIAA1153 IPI00099311 tRNA(adenine-N(1)-)-methyltransferase PGDH3 IPI00011200 D-3 phosphoglycerate dehydrogenase

GFAT IPI00217952 D-fructose-6-phosphate amidotransferase 1 PSMD3 IPI00011603


RPS16 IPI00221092 40S ribosomal protein S16 AIMP2 IPI00011916 Aminoacyl tRNA synthase complex-interacting multifunctional protein 2

FUS IPI00260715 75 kDa DNA-pairing protein H2AFC IPI00291764 Histone H2A type 1

ALY IPI00328840 Transcriptional coactivator Aly/REF DEK IPI00020021 Protein DEK

H4/A IPI00453473 Histone H4 LMN1 IPI00021405 70 kDa lamin

NPM IPI00549248 Nucleolar phosphoprotein B23 EIF5 IPI00022648 Eukaryotic translation initiation factor 5

H2BFD IPI00646240 Histone H2B FARS IPI00031820 Phenylalanine-tRNA ligase alpha chain

GNB2L1 IPI00848226 Cell proliferation-inducing gene 21 protein CAPL IPI00032313 Calvasculin

HNRNPU IPI00883857 Heterogeneous nuclear ribonucleoprotein U PSMD11 IPI00105598


HSP73 IPI00003865 Heat shock 70 kDa protein 8 EZR IPI00843975 Ezrin

TMOD3 IPI00005087 Tropomodulin-3 PLS3 IPI00216694 Plastin-3

ARPC18 IPI00005160 Actin-related protein 2/3 complex subunit 1B TPM3 IPI00218319 Tropomyosin alpha-4 chain

SPTB2 IPI00005614 Beta-II spectrin GAPD IPI00219018 Glyceraldehyde-3-phosphate dehydrogenase

RPS10 IPI00008438 40S ribosomal protein S10 EIF3S2 IPI00012795 Eukaryotic translation initiation factor 3 subunit

RPS20 IPI00012493 40S ribosomal protein S20 EIF3G IPI00290460 Eukaryotic initiation factor 3 RNA-binding subunit

RPS25 IPI00012750 40S ribosomal protein S25 TUBA1 IPI00007750 Alpha-tubulin 1

EIF3C IPI00016910 Eukaryotic translation initiation factor 3 subunit CBX3 IPI00297579 Chromobox protein homolog

RPS21 IPI00017448 40S ribosomal protein S21 EIF5B IPI00299254 Eukaryotic translation initiation factor 5B

ARP3 IPI00028091 Actin-like protein 3 FARSB IPI00300074 Phenylalanine-tRNA ligase beta chain

RPS8 IPI00216587 40S ribosomal protein S8 RPL13A IPI00304612 60S ribosomal protein L13A

RPS4 IPI00217030 40S ribosomal protein S4 CACYBP IPI00395627 Calcyclin-binding protein

HPG2 IPI00396378 Heterogeneous nuclear ribonucleoprotein A2/B1 RPL26 IPI00433834 60S ribosomal protein L26



RPS3A IPI00419880 40S ribosomal protein S3A UCHL5 IPI00642374 Ubiquitin carboxyl-terminal hydrolase

RPS15 IPI00479058 40S ribosomal protein S15 ASNS IPI00554777 Asparagine synthetase

FUBP2 IPI00479786 Far upstream element-binding protein 2 RPL14 IPI00555744 60S ribosomal protein L14

RPS24 IPI00915363 40S ribosomal protein S24 NME2 IPI00604590 Nucleoside diphosphate kinase

PSMD5 IPI00002134 26S protease subunit S5 basic CAPRIN1 IPI00783872 Caprin-1

CRDBP IPI00008557 Coding region determinant-binding protein CSDE1 IPI00844264 Cold shock domain-containing protein E1

ERP70 IPI00009904 Endoplasmic reticulum resident protein 70 PSMD2 IPI00012268 26S proteasome non-ATPase regulatory subunit

PSMC1 IPI00011126 26S protease regulatory subunit 4 SSB IPI00009032 Sjoegren syndrome type B antigen

RPS3 IPI00011253 40S ribosomal protein S3 COPS7B IPI00009301 COP9 signalosome complex subunit 7b

EIF3E IPI00013068 Eukaryotic translation initiation factor 3 subunit 6 EEF1E1 IPI00003588 Eukaryotic translation elongation factor 1 epsilon

ACTN4 IPI00013808 Alpha-actinin-4 PSMC6 IPI00926977 26S protease regulatory subunit 10B

MM1 IPI00015361 C-Myc-binding protein Mm-1 PSMB7 IPI00003217 Macropain chain Z

CAS IPI00022744 Cellular apoptosis susceptibility protein EFTUD2 IPI00003519 Elongation factor Tu GTP-binding domain-containing protein

NSEP1 IPI00031812 CCAAT-binding transcription factor 1 subunit PSME3 IPI00005260 Proteasome activator complex subunit 4

DBC1 IPI00182757 Deleted in breast cancer gene 1 protein ACO1 IPI00008485 Cytoplasmic aconitate hydrolase

MSN IPI00219365 Membrane-organizing extension protein 1 ATP6A1 IPI00007682 Vacuolar ATPase Isoform VA68

PSME3 IPI00219445 11S regulator complex subunit gamma CAPE IPI00007927 Chromosome-associated protein E


RBS13 IPI00221089 40S ribosomal protein 13 PAB1 IPI00008524 Polyadenylate-binding protein 1

RBS15A IPI00221091 40S ribosomal protein S15A MYH9 IPI00019502 Cellular myosin heavy chain

RBS17 IPI00221093 40S ribosomal protein S17 G22P2 IPI00220834 86 kDa subunit of Ku antigen

CRM1 IPI00298961 Chromosome region maintenance 1 protein homolog CAP43 IPI00022078 Differentiation-related gene 1 protein

HGRG8 IPI00306043 CLL-associated antigen KW-14 TLP46 IPI00171438 Thioredoxin domain-containing protein 5

CAPZA2 IPI00026182 F-actin-capping protein subunit alpha-Z RPL12 IPI00024933 60S ribosomal protein L12



DBP1 IPI00396435 ATP-dependent RNA helicase 46 CAPN4 IPI00025084 Calcium-activated neutral proteinase small subunit

ANX2 IPI00418169 Annexin A2 FAS IPI00026781 S-malonyltransferase

BAG3 IPI00641582 BAG family molecular chaperone regulator 3 CLIP1 IPI00013455 CAP-Gly domain-containing linker protein 1

EIFS3 IPI00647650 highly similar to Eukaryotic translation initiation factor 3 subunit 3 PSMD6 IPI00014151


RPS28 IPI00719622 40S ribosomal protein 28 G2AN IPI00383581 Alpha-glucosidase 2

FLNB IPI00900293 Filamin B NKEFB IPI00027350 Natural killer cell-enhancing factor B

PSME2 IPI00943181 Putative uncharacterized protein PSME2 PSMB2 IPI00028006 Macropain subunit C7-I

DNCLI1 IPI00007675 Cytoplasmic dynein 1 light intermediate chain 1 PUS7 IPI00044761 Pseudouridylate synthase 7 homolog

RPS5 IPI00008433 40S ribosomal protein 5 LARS IPI00103994 Leucine-tRNA ligase

RPS12 IPI00013917 40S ribosomal protein 12 PTB IPI00183626 Polypyrimidine tract binding protein 1

PLEC1 IPI00014898 Plectin-1 LARP IPI00185919 La-related protein 1

PSMC3 IPI00018398 26S protease regulatory subunit 6A EEF2 IPI00186290 Elongation factor 2

PSMD7 IPI00019927 26S protease regulatory subunit 6B H1F5 IPI00217468 Histone H1.5

PSMC2 IPI00021435 26S protease regulatory subunit 7 G6PD IPI00216008 Glucose-6-phosphate 1-dehydrogenase

RPS6 IPI00021840 40S ribosomal protein S6 RBAP48 IPI00328319 Chromatin assembly factor 1 subunit C

RPS11 IPI00025091 40S ribosomal protein S11 IMP48 IPI00398009 Importin-4

H1F2 IPI00217465 Histone H1.2 ISOC1 IPI00304082 Isochorismatase domain-containing protein

LAMB2 IPI00294879 Laminin B2 chain
