PLASTID-TARGETED PROTEINS ARE ABSENT FROM THE PROTEOMES OF ACHLYA HYPOGYNA AND THRAUSTOTHECA CLAVATA (OOMYCOTA, STRAMENOPILA): IMPLICATIONS FOR THE ORIGIN OF CHROMALVEOLATE PLASTIDS AND THE ‘GREEN GENE’ HYPOTHESIS Lindsay Rukenbrod A Thesis Submitted to the University of North Carolina Wilmington in Partial Fulfillment of the Requirements for the Degree of Master of Science Center for Marine Science University of North Carolina Wilmington 2012 Approved by Advisory Committee D. Wilson Freshwater Jeremy Morgan Allison Taylor J. Craig Bailey Chair Accepted by Dean, Graduate School

PLASTID-TARGETED PROTEINS ARE ABSENT FROM THE PROTEOMES OF ACHLYA HYPOGYNA AND THRAUSTOTHECA CLAVATA (OOMYCOTA,

STRAMENOPILA): IMPLICATIONS FOR THE ORIGIN OF CHROMALVEOLATE PLASTIDS AND THE ‘GREEN GENE’ HYPOTHESIS

Lindsay Rukenbrod

A Thesis Submitted to the University of North Carolina Wilmington in Partial Fulfillment

of the Requirements for the Degree of Master of Science

Center for Marine Science

University of North Carolina Wilmington

2012

Approved by

Advisory Committee

D. Wilson Freshwater Jeremy Morgan Allison Taylor J. Craig Bailey Chair

Accepted by

Dean, Graduate School

This thesis has been prepared in the style and format consistent with the Journal of Eukaryotic Microbiology.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS
DEDICATION
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: Implications for the origin of chromalveolate plastids
    INTRODUCTION
    METHODS
    RESULTS AND DISCUSSION
        Revised Hypotheses for the Evolution of Chromalveolate Plastids
CHAPTER 2: Do chromalveolate genomes encode ‘green genes’?
    INTRODUCTION
    METHODS
    RESULTS AND DISCUSSION
        Green Genes in Oomycetes and Other Chromalveolates?
SUPPLEMENTAL INFORMATION
LITERATURE CITED

ABSTRACT

Chapter 1

The chromalveolate hypothesis predicts that extant nonphotosynthetic stramenopiles are secondarily nonphotosynthetic and derived from ancestors bearing a secondary red-type plastid. To test this hypothesis, proteomes of the oomycetes Achlya hypogyna and Thraustotheca clavata were canvassed for plastid-targeted proteins. Proteins from each species bearing putative plastid-targeting signal peptides were identified, annotated, and assigned to protein families if possible. Forty-six candidate proteins were culled from the two genomes. Bioinformatic analyses revealed that the proteomes of Achlya and Thraustotheca do not include plastid-targeted proteins acquired by endosymbiotic gene transfer. All proteins possessing non-mitochondrial-targeting signal peptides were judged to belong to the secretome (i.e., extracellularly secreted proteins). These results indicate that oomycetes are ancestrally aplastidic stramenopiles and do not support the chromalveolate theory of plastid evolution. Revised hypotheses for the origin of plastids characterized by chlorophylls a and c and fucoxanthin are presented. It is concluded that alveolate and stramenopile plastids are likely tertiary or higher-order plastids, not secondary plastids.

Chapter 2

The hypothesis that a green algal symbiont preceded the red algal symbiont that gave rise to red-type plastids in the ancestors of the chromalveolates is reexamined. A network approach was used to detect nuclear-encoded proteins from the genomes of Achlya hypogyna, Thraustotheca clavata, other oomycetes, and other chromalveolates that cluster with green algal genes. Twelve oomycete proteins clustering with green algal genes at high stringency were annotated and selected for further analyses. Representative homologs from all other available eukaryotic taxa were aligned to the sequences comprising each network, and maximum likelihood trees were constructed from these alignments. Protein trees derived from these data exhibited obvious errors resulting from taxon biases and heterotachy. These results argue that ‘green genes’ detected in phylogenomics studies are artifactual and not indicative of endosymbiotic gene transfer.

ACKNOWLEDGMENTS

My thanks go to my advisor, Dr. J. Craig Bailey, whose enthusiasm about

molecular protistology caught my interest in the very beginning of my scientific

education. His continuous encouragement, wit, and sense of humor made this journey

an enjoyable one. Ian Misner and Dr. Chris Lane of the University of Rhode Island have

also been instrumental in my education, providing feedback and technical support in my

research. I’d also like to thank my committee members, Dr. D. Wilson Freshwater, Dr.

Jeremy Morgan, and Dr. Allison Taylor, for their encouragement and flexibility

throughout this process.

My lab mates past and present, particularly Cory Dashiell, Erika Shwarz, Ashley

Hayes, and Allison Martin, helped me maintain my focus over the years throughout

failed DNA extractions, computer malfunctions, approaching deadlines, and many other

graduate school-related challenges.

The Department of Biology and Marine Biology, the Center for Marine Science

and the National Science Foundation provided financial support for my education and

research.

Finally, I’d like to thank my parents and my husband for supporting me every step

of the way.

DEDICATION

I’d like to dedicate this to my mother, whose endless patience has allowed me to

explore life with few restrictions and overwhelming love and support.

LIST OF TABLES

Chapter 1

1. Protein IDs for 46 hypothetical proteins detected in the genomes of Thraustotheca and/or Achlya
2. Protein ID numbers, annotations and protein family designations
3. Proteins sorted into one of 14 unique protein families
4. List of seven proteins from the Achlya and Thraustotheca and putative homologs found in the Arabidopsis thaliana plastid proteome

Chapter 2

1. List of 12 annotated proteins from the Achlya and/or Thraustotheca proteomes or other oomycetes found in EGNs

LIST OF FIGURES

Chapter 1

1. Hypotheses for the origin of complex, higher order chlorophyll a+c-containing plastids in chromalveolates

Chapter 2

1. Three examples of putative green genes in oomycete genomes based on EGN analysis
2. DEXDc ML tree
3. RPB ML tree
4. ALDH ML tree
5. TOR-containing kinase ML tree
6. YAK1 ML tree
7. ALS ML tree

CHAPTER 1: Implications for the origin of chromalveolate plastids.

INTRODUCTION

The evolutionary origin and subsequent movement of secondary and higher-order plastids among photosynthetic eukaryotes are the subject of intense debate. The principal key to unraveling the evolutionary history of plastids is an accurate understanding of the relationships among both host and plastid lineages (Archibald 2009; Green 2011). This goal is hampered by the mosaic nature of eukaryotic genomes, which are composed of lineage-specific genes inherited vertically, thousands of genes acquired by endosymbiotic gene transfer (EGT), and genes obtained via lateral gene transfer (LGT) (Archibald 2008; Green 2011; Keeling 2009; Larkum 2007).

The chromalveolate hypothesis posits that the alveolates, cryptomonads, haptophytes and stramenopiles are monophyletic and that the last common ancestor of these lineages was a photosynthetic alga bearing a red-type plastid (Cavalier-Smith 1999; 2003). This notion is supported, in the first instance, by the fact that photosynthetic members of these chlorophyll a+c-containing groups all possess red-type plastids surrounded by three or four unit membranes [the so-called chloroplast endoplasmic reticulum, or CER], a feature indicative of secondary endosymbiosis (Dodge 1975; Foth and McFadden 2003; Guillot and Gibbs 1980a, b; Gibbs 1981a, b; Köhler et al. 1997). Second, nuclear-encoded plastid-targeted proteins in these algae are characterized by the presence of an N-terminal bipartite signal sequence that directs gene products to the plastid and across the outer and inner pairs of plastid membranes (Kroth 2002; Soll and Schleiff 2004). In terms of coding capacity, gene content, and organization, the plastid genomes of chromalveolates resemble those of red algae far more closely than they resemble the plastid genomes of green algae (Delwiche 1999; Keeling 2004; Yoon et al. 2002). Cavalier-Smith (1999) originally emphasized that the chromalveolate hypothesis is consistent with the idea that the chloroplast endoplasmic reticulum (CER) and complex protein-trafficking systems that characterize chromalveolates are unlikely to have evolved independently on different occasions (see Kroth 2002; Ralph et al. 2004).

Over the last decade, the ‘chromalveolate’ concept has been tested, implicitly or explicitly, in numerous broad-scale phylogenetic studies. The chromalveolates have not been recovered as a monophyletic group in any study (Archibald 2009; Baurain et al. 2010). More recent studies imply that the relationships among chromalveolate host cells and their plastids are more complex than originally supposed, perhaps involving tertiary and higher-order transfers among hosts (Archibald 2009; Bodyl 2005; Keeling 2004; Sanchez-Puerta and Delwiche 2008). In this paper the chromalveolate hypothesis is re-examined in light of new genomic data available for nonphotosynthetic members of the Stramenopila.

The stramenopiles, one of the four principal taxa included in the Chromalveolata, are divided into two groups. (i) The ‘photosynthetic stramenopiles’, ‘heterokont algae’ or ‘ochrophytes’ comprise chlorophyll a+c-containing photosynthetic algae including phaeophytes, chrysophytes, diatoms, eustigmatophytes, pelagophytes, and xanthophytes (Lee et al. 2000). (ii) The nonphotosynthetic stramenopiles are bacterivorous, parasitic or saprobic heterotrophs and include bicosoecids, hyphochytrids, labyrinthulids, oomycetes, and thraustochytrids, among others (Lee et al. 2000). The oomycetes are the most diverse, well studied, and economically important of all nonphotosynthetic stramenopiles.

The chromalveolate hypothesis implies that extant aplastidic stramenopiles are derived from ancestors that once possessed a secondary red-type plastid. However, there is no ultrastructural or DNA evidence suggesting that bicosoecids, hyphochytrids, labyrinthulids, oomycetes, or thraustochytrids possess, or possessed in the past, a plastid. Furthermore, ultrastructural or DNA sequence evidence for cryptic plastids in these organisms is absent or controversial (Lee et al. 2000; Reyes-Prieto et al. 2008; Slamovits and Keeling 2008; Stiller et al. 2009).

In this study the proteomes of the oomycetes Achlya hypogyna and Thraustotheca clavata were canvassed in search of photosynthesis-related genes.

METHODS

Full-length predicted proteins were obtained from ongoing genome sequencing projects for Achlya hypogyna (ATCC48635) and Thraustotheca clavata (ATCC34112), estimated to encode 17,430 and 12,154 predicted proteins, respectively; additional details will be published separately. The Achlya and Thraustotheca proteomes were searched for possible plastid-targeted proteins using the transit peptide prediction program ChloroP (v.1.1) (Emanuelsson et al. 1999). Hypothetical proteins returned from these searches were subsequently analyzed using SignalP (v.4.0) (Petersen et al. 2011), annotated, and assigned to protein families if possible using the Conserved Domain Database (CDD) (Marchler-Bauer et al. 2007). Mitochondria-targeted proteins and proteins possessing transmembrane regions identified using TMHMM (v.2.0) were removed from the data set (Krogh et al. 2001). Searches for heterokont-like bipartite plastid-targeting peptides, consisting of both signal and transit peptide motifs, were conducted using HECTAR (Gruber et al. 2007; Gschloessl et al. 2008; Waller et al. 2000). Finally, the oomycete proteins were BLASTed against the Arabidopsis thaliana plastid proteome database (which includes plastid-encoded and nuclear-encoded plastid-targeted proteins) using plprot v.2.3 (Baginsky et al. 2005; Kleffmann et al. 2004; 2006).
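The screening logic described above can be illustrated with a minimal sketch. It assumes that the per-protein calls from each tool have already been parsed into simple records; the record layout, the function name `screen_candidates`, the inclusion rule, and the demonstration records are all hypothetical and are not the scripts or data used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Per-protein calls gathered from the tools named in the Methods (hypothetical layout)."""
    protein_id: str
    chlorop_hit: bool    # ChloroP reports a transit peptide
    signalp_hit: bool    # SignalP reports a signal peptide
    hectar_class: str    # HECTAR category, e.g. "Chloroplast" or "Signal peptide"
    mitochondrial: bool  # flagged as mitochondria-targeted
    tm_helices: int      # number of transmembrane helices predicted by TMHMM


def screen_candidates(predictions):
    """Keep proteins with a targeting signal; drop mitochondrial and membrane proteins.

    Assumption: a protein counts as a candidate if any of the three predictors
    flags it. The thesis does not spell out the exact inclusion rule.
    """
    kept = []
    for p in predictions:
        has_signal = p.chlorop_hit or p.signalp_hit or p.hectar_class == "Chloroplast"
        if has_signal and not p.mitochondrial and p.tm_helices == 0:
            kept.append(p.protein_id)
    return kept


# Hypothetical records for illustration only; not data from this study.
demo = [
    Prediction("PROT_A", True, True, "Chloroplast", False, 0),
    Prediction("PROT_B", True, False, "Signal peptide", True, 0),    # mitochondrial
    Prediction("PROT_C", False, True, "Signal peptide", False, 3),   # membrane protein
    Prediction("PROT_D", False, False, "No N-terminal target peptide found", False, 0),
]
print(screen_candidates(demo))
```

Only the first demonstration record survives the filter; the others are excluded as mitochondrial, membrane-bound, or lacking any predicted targeting signal.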

RESULTS AND DISCUSSION

The chromalveolate hypothesis implies that the ancestors of oomycetes were photosynthetic organisms bearing red-type plastids, and putative plastid-related genes have been reported from the genomes of the plant pathogens Phytophthora ramorum and P. sojae (Tyler et al. 2006). The competing hypothesis is the long-held view that oomycetes are ancestrally aplastidic. It is possible that the ancestors of oomycetes were photosynthetic but that extant members of the group have not retained any plastid-associated genes. On the other hand, empirical data including studies of apicomplexans, dinoflagellates and other taxa imply that plastid-associated genes are unlikely to be completely purged from the genome even in organisms where a vestigial, nonphotosynthetic plastid is absent (Barbrook et al. 2006; de Koning and Keeling 2004; Matsuzuki et al. 2008; Wilson 2004; Sanchez-Puerta et al. 2007).

Thirty hypothetical proteins from the Achlya genome and 16 from the Thraustotheca genome putatively possessing an N-terminal plastid-targeting signal peptide were identified (Table 1). Of these 46 proteins, 22 are presently characterized as hypothetical proteins of unknown function; 24 were annotated (E-value < 1.00E-25) and found to represent 14 unique protein families (Tables 2, 3).

BLASTp queries revealed that none of the oomycete proteins (Table 2) are

encoded by the 271 eukaryotic plastid genomes sequenced to date. None of the 46

presequences examined here possess the ASAFP (Y/W/L) motif necessary for plastid

import in diatoms, although the significance of this observation is unclear (Gruber et al.

2007) (see supplementary Tables S1 and S2).

Putative homologs to seven of the oomycete proteins were detected in the A. thaliana plastid proteome (Table 4). These seven oomycete proteins are more-or-less distant relatives of three A. thaliana genes. Both Achlya and Thraustotheca encode proteins similar to the zinc-finger type WRKY1 DNA-binding transcription factor that plays a role in disease resistance in A. thaliana (Dong et al. 2003; Shindo et al. 2012). Three Achlya proteins and one Thraustotheca protein are putative homologs of cysteine proteinase RD21A in the A. thaliana plastid proteome. Finally, a single Achlya protein distantly related (E-value 6E-17) to A. thaliana aldehyde dehydrogenase (ALDH) was also detected.

These three genes are not indicators of photosynthesis per se because homologs have been detected from across the tree of life in photosynthetic (e.g., plants and green algae) and nonphotosynthetic organisms (e.g., eubacteria, animals, fungi, and the amoebozoan Dictyostelium). Homologs of each of these putative genes that are more closely related to the Achlya and Thraustotheca proteins have previously been detected in the genomes of Phytophthora infestans, P. sojae (Pythiales) and the white rust Albugo laibachii (Tyler et al. 2006).

The annotated proteins recovered in this study include nine known to belong to oomycete secretomes, and six of these are common secreted hydrolases, including proteases, chitinase and cellulase (Tables 2, 3; Birch et al. 2006; Gaulin et al. 2008; Kamoun 2006; Levesque et al. 2010). One of the proteins belongs to the elicitin family, a family of virulence genes unique to oomycetes (Jiang et al. 2006). Based upon these data, plastid-associated genes are not present in the Achlya or Thraustotheca predicted proteomes.

Revised Hypotheses for the Evolution of Chromalveolate Plastids

These data, as well as the study by Stiller et al. (2009), indicate that oomycetes are ancestrally aplastidic despite reports to the contrary (Tyler et al. 2006). This information and the results of recent phylogenomics investigations have been synthesized, and revised hypotheses for the evolution of chromalveolate plastids are presented in Figures 1 and 2. These diagrams reflect a number of assumptions that are enumerated for the sake of clarity. (i) The Chromalveolata sensu stricto is paraphyletic (e.g., Iida et al. 2007; Khan et al. 2007; reviewed in Green 2011; Rogers et al. 2007). (ii) Oomycetes, all other heterotrophic stramenopiles, as well as the ciliates are ancestrally aplastidic (Archibald 2008; Reyes-Prieto et al. 2008; Tyler et al. 2006). (iii) The SAR clade is recognized as natural (Burki et al. 2007; Hackett et al. 2007; Lane & Archibald 2008). (iv) Recent studies imply that SAR and Hacrobia host cells are likely distantly related (Baurain et al. 2010; Hackett et al. 2007; Parfrey et al. 2010). For these reasons, no specifically defined relationship between SAR and Hacrobia host cells is implied in Figure 1. The diagrams comprising Figure 1 are drawn under the assumption that the Hacrobia is monophyletic (Burki et al. 2007; Hackett et al. 2007; Harper et al. 2005; Patron et al. 2007).

These hypotheses share elements in common with prior models of chromalveolate plastid evolution in which multiple plastid acquisitions (or plastid replacements) are inferred via serial endosymbiotic transfer (Archibald 2008; Bodyl 2005; Bodyl et al. 2009; Bodyl and Moszczynski 2006; Sanchez-Puerta & Delwiche 2008). Two predictions derived from these models bear emphasizing: (1) alveolates and stramenopiles likely possess tertiary or quaternary plastids and (2) it is conceivable that one of these taxa, the alveolates or stramenopiles, may have obtained its plastid from the other (Fig. 1). Finally, it is noted that the number of membranes surrounding higher-order, complex plastids seems to be fixed at four or fewer.

Fig. 1. Hypotheses for the origin of complex, higher order chlorophyll a+c-containing plastids in chromalveolates. (A) Independent acquisition of a tertiary (3°) plastid in the alveolate and stramenopile lineages from the Hacrobia lineage. (B) Serial endosymbiotic transfer resulting in a quaternary (4°) alveolate plastid from the 3° stramenopile plastid. (C) Serial endosymbiotic transfer resulting in a 4° stramenopile plastid from the 3° alveolate plastid.

Table 1. Protein IDs for 46 hypothetical proteins detected in the genomes of Thraustotheca and/or Achlya characterized by the presence of a putative plastid-targeting N-terminal signal peptide sequence. ChloroP was used to detect classical plastid transit peptides. HECTAR was used to search for bipartite plastid-targeting leader sequences characteristic of stramenopiles and other chromalveolates (Kilian and Kroth 2003; McFadden and van Dooren 2004; Vesteg et al. 2009).

Protein ID      SignalP   ChloroP   HECTAR

Thraustotheca clavata
THRCLA_02069    Y         Y         Chloroplast
THRCLA_03737    Y         Y         Signal peptide
THRCLA_03876    Y         Y         Signal peptide
THRCLA_04285    Y         -         Signal peptide
THRCLA_04386    Y         Y         Signal peptide
THRCLA_04952    N         Y         Signal peptide
THRCLA_05863    Y         Y         Signal peptide
THRCLA_06099    Y         Y         Signal peptide
THRCLA_07047    Y         Y         Signal peptide
THRCLA_08011    Y         -         Signal peptide
THRCLA_10855    N         -         Signal peptide
THRCLA_10997    N         -         Signal peptide
THRCLA_11248    Y         Y         No N-terminal target peptide found
THRCLA_11271    Y         Y         Chloroplast
THRCLA_11391    Y         Y         Signal peptide
THRCLA_11516    Y         Y         Signal peptide

Achlya hypogyna
ACHHYP_00269    Y         Y         Signal peptide
ACHHYP_01095    Y         -         Signal peptide
ACHHYP_01226    Y         -         Signal peptide
ACHHYP_01546    Y         Y         Signal peptide
ACHHYP_02169    Y         Y         Chloroplast
ACHHYP_02305    Y         Y         Signal peptide
ACHHYP_03044    Y         Y         Signal peptide
ACHHYP_03052    N         -         Signal peptide
ACHHYP_04549    Y         Y         Chloroplast
ACHHYP_04706    Y         Y         Signal peptide
ACHHYP_04908    Y         Y         Signal peptide
ACHHYP_05005    Y         Y         Signal peptide
ACHHYP_05180    Y         Y         Signal peptide
ACHHYP_05326    Y         Y         Signal peptide
ACHHYP_05770    Y         Y         Signal peptide
ACHHYP_06287    Y         -         Signal peptide
ACHHYP_06505    Y         -         Signal peptide
ACHHYP_06977    Y         Y         Signal peptide
ACHHYP_07400    Y         Y         Chloroplast
ACHHYP_08323    Y         -         Signal peptide
ACHHYP_09221    Y         Y         Chloroplast
ACHHYP_09519    Y         Y         Chloroplast
ACHHYP_10824    Y         Y         Signal peptide
ACHHYP_11025    Y         Y         Chloroplast
ACHHYP_11286    Y         -         Signal peptide
ACHHYP_11397    Y         Y         Signal peptide
ACHHYP_12628    Y         -         Chloroplast
ACHHYP_13722    Y         Y         Chloroplast
ACHHYP_14385    Y         Y         Signal peptide
ACHHYP_15409    Y         Y         Chloroplast
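As a reading aid for Table 1, the HECTAR calls can be tallied per species. The sketch below is illustrative only: the protein lists were transcribed by hand from Table 1, and the dictionary and function names are hypothetical rather than part of any analysis pipeline used here.

```python
# Totals and HECTAR calls transcribed by hand from Table 1.
TOTAL_PROTEINS = {"Thraustotheca clavata": 16, "Achlya hypogyna": 30}

HECTAR_CHLOROPLAST = {
    "Thraustotheca clavata": ["THRCLA_02069", "THRCLA_11271"],
    "Achlya hypogyna": [
        "ACHHYP_02169", "ACHHYP_04549", "ACHHYP_07400",
        "ACHHYP_09221", "ACHHYP_09519", "ACHHYP_11025",
        "ACHHYP_12628", "ACHHYP_13722", "ACHHYP_15409",
    ],
}

HECTAR_NO_TARGET = {
    "Thraustotheca clavata": ["THRCLA_11248"],
    "Achlya hypogyna": [],
}


def tally_hectar_calls():
    """Count HECTAR categories per species; the remainder are 'Signal peptide' calls."""
    summary = {}
    for species, total in TOTAL_PROTEINS.items():
        chloroplast = len(HECTAR_CHLOROPLAST[species])
        no_target = len(HECTAR_NO_TARGET[species])
        summary[species] = {
            "Chloroplast": chloroplast,
            "Signal peptide": total - chloroplast - no_target,
            "No target peptide": no_target,
        }
    return summary


summary = tally_hectar_calls()
for species, counts in summary.items():
    print(species, counts)
```

Under this transcription, HECTAR assigns the "Chloroplast" category to 11 of the 46 proteins, and nearly all of the remainder are classified as carrying a signal peptide only.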

Table 2. Protein ID numbers, annotations (E-value < 1.00E-25), and protein family designations for 46 proteins from the Thraustotheca and Achlya genomes putatively possessing N-terminal plastid-targeting signal peptides.

Gene/Protein ID   Annotation                                                pfam

Thraustotheca clavata
THRCLA_02069      putative GPI-anchored serine-rich hypothetical protein    _
THRCLA_03737      cd05384: SCP_PRY1_like [COG2340]                          pfam00188
THRCLA_03876      hypothetical protein, with EGF-like motif                 _
THRCLA_04285      Kazal-type serine proteinase inhibitor                    pfam07648
THRCLA_04386      hypothetical protein                                      _
THRCLA_04952      hypothetical protein                                      _
THRCLA_05863      hypothetical protein                                      _
THRCLA_06099      putative GPI-anchored serine-rich hypothetical protein    _
THRCLA_07047      hypothetical protein                                      _
THRCLA_08011      cysteine protease family C01A, putative                   pfam00112
THRCLA_10855      hypothetical protein                                      _
THRCLA_10997      chitinase D-like                                          pfam00704
THRCLA_11248      hypothetical protein, unknown function                    _
THRCLA_11271      hypothetical protein, elicitin superfamily                pfam00964
THRCLA_11391      beta-N-acetylglucosaminidase                              pfam00728
THRCLA_11516      hypothetical protein, unknown function                    _

Achlya hypogyna
ACHHYP_00269      putative GPI-anchored serine-rich hypothetical protein    _
ACHHYP_01095      beta-N-acetylglucosaminidase                              pfam00728
ACHHYP_01226      hypothetical protein                                      pfam12937
ACHHYP_01546      hypothetical protein                                      _
ACHHYP_02169      trypsin-like serine protease                              pfam13365
ACHHYP_02305      putative GPI-anchored serine-rich hypothetical protein    _
ACHHYP_03044      putative chitinase-like carbohydrate-binding protein      pfam00704
ACHHYP_03052      hypothetical protein                                      _
ACHHYP_04549      hypothetical protein                                      _
ACHHYP_04706      hypothetical protein encoding ricin_B_lectin              pfam00652
ACHHYP_04908      hypothetical protein                                      _
ACHHYP_05005      putative D-lactate dehydrogenase                          pfam01565
ACHHYP_05180      hypothetical protein                                      _
ACHHYP_05326      Cellulase                                                 pfam00150
ACHHYP_05770      hypothetical protein                                      _
ACHHYP_06287      hypothetical protein                                      _
ACHHYP_06505      papain family cysteine protease                           pfam00112
ACHHYP_06977      hypothetical protein                                      _
ACHHYP_07400      hypothetical protein                                      _
ACHHYP_08323      hypothetical protein containing PAN domain                pfam00024
ACHHYP_09221      hypothetical protein                                      _
ACHHYP_09519      hypothetical protein encoding ricin_B_lectin              pfam00652
ACHHYP_10824      ankyrin repeat protein                                    pfam12796
ACHHYP_11025      hypothetical protein                                      _
ACHHYP_11286      aldehyde dehydrogenase                                    pfam0017
ACHHYP_11397      hypothetical protein                                      _
ACHHYP_12628      papain-like cysteine protease C1                          pfam00112
ACHHYP_13722      hypothetical protein                                      _
ACHHYP_14385      hypothetical protein                                      _
ACHHYP_15409      papain-like cysteine protease C1                          pfam00112

Table 3. Proteins investigated in this study were sorted into one of 14 unique protein families, which are listed below. Note that all proteins investigated (see Table 2) are predicted to have an N-terminal signal peptide and that nine of the 14 families include secreted proteins. Six of the families include proteases, and the elicitin family of virulence proteins is secreted extracellularly and is unique to oomycetes.

pfam ID   Protein ID                                                Protein family / Conserved domains

00188     THRCLA_03737                                              Cysteine-rich secretory protein family
07648     THRCLA_04285                                              Kazal_2: Kazal-type serine protease inhibitor domain
00112     THRCLA_08011, ACHHYP_06505, ACHHYP_12628, ACHHYP_15409    Peptidase_C1: Papain family cysteine protease
00964     THRCLA_11271                                              Elicitin
00728     THRCLA_11391, ACHHYP_01095                                Glyco_hydro_20: Glycosyl hydrolase family 20, catalytic domain
12937     ACHHYP_01226                                              F-box-like
13365     ACHHYP_02169                                              Trypsin_2: Trypsin-like peptidase domain
00704     THRCLA_10997, ACHHYP_03044                                Glyco_hydro_18: Glycosyl hydrolases family 18
00652     ACHHYP_04706, ACHHYP_09519                                Ricin_B_lectin: Ricin-type beta-trefoil lectin domain
01565     ACHHYP_05005                                              FAD_binding_4: FAD binding domain
00150     ACHHYP_05326                                              Cellulase: Cellulase (glycosyl hydrolase family 5)
00024     ACHHYP_08323                                              PAN_1: PAN domain
12796     ACHHYP_10824                                              Ank_2: Ankyrin repeats
0017      ACHHYP_11286                                              aldehyde dehydrogenase superfamily (ALDH-SF)

Table 4. List of seven proteins from the Achlya hypogyna and Thraustotheca clavata oomycete genomes and putative homologs found in the Arabidopsis thaliana plastid proteome. Reference refers to functional studies of the genes identified in this analysis.

Oomycete protein ID   A. thaliana plastid proteome ID   Gene annotation                                                   E-value    Reference

ACH_05770             plp_at_01492                      disease resistance protein related to DNA-binding protein WRKY1   2.00E-20
THR_04952             plp_at_01492                      disease resistance protein related to DNA-binding protein WRKY1   7.00E-23
THR_08011             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 4.00E-53   Shindo et al. 2012
ACH_15409             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 1.00E-24   Shindo et al. 2012
ACH_12628             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 1.00E-18   Shindo et al. 2012
ACH_06505             plp_at_00089                      cysteine proteinase RD21A (=thiol protease RD21A)                 9.00E-47   Shindo et al. 2012
ACH_11286             plp_at_00466                      aldehyde dehydrogenase (ALDH)                                     6.00E-17
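The relationship between the seven oomycete proteins and the three A. thaliana entries in Table 4 can be made explicit by grouping the hits by target. The following is a minimal sketch using values transcribed by hand from Table 4; the function name `group_by_target` is hypothetical, and the code is not the procedure used to generate the table.

```python
from collections import defaultdict

# (oomycete protein ID, A. thaliana plastid proteome ID, E-value), transcribed from Table 4.
HITS = [
    ("ACH_05770", "plp_at_01492", 2.00e-20),
    ("THR_04952", "plp_at_01492", 7.00e-23),
    ("THR_08011", "plp_at_00089", 4.00e-53),
    ("ACH_15409", "plp_at_00089", 1.00e-24),
    ("ACH_12628", "plp_at_00089", 1.00e-18),
    ("ACH_06505", "plp_at_00089", 9.00e-47),
    ("ACH_11286", "plp_at_00466", 6.00e-17),
]


def group_by_target(hits):
    """Map each A. thaliana entry to its oomycete hits, smallest E-value first."""
    grouped = defaultdict(list)
    for query, target, evalue in hits:
        grouped[target].append((evalue, query))
    return {target: sorted(pairs) for target, pairs in grouped.items()}


groups = group_by_target(HITS)
for target, pairs in groups.items():
    print(target, [(query, f"{evalue:.0e}") for evalue, query in pairs])
```

Grouping in this way shows seven oomycete proteins matching three A. thaliana entries, with the strongest match (THR_08011) falling in the RD21A group.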

CHAPTER 2: Do chromalveolate genomes encode ‘green genes’?

INTRODUCTION

One of the most vexing problems in eukaryote systematics concerns the interrelationships among the so-called ‘chromalveolates’ (Archibald 2008; Cavalier-Smith 1999; Green 2011; Keeling 2004). The Chromalveolata is a paraphyletic taxon whose members can be divided into two groups. The first group (the SAR clade) includes the alveolates (apicomplexans, dinoflagellates, and ciliates), which are sister to the stramenopiles (including phaeophytes, chrysophytes, and oomycetes). In turn, these two clades are sister to the Rhizaria, a group principally composed of free-living amoebae (Burki et al. 2007; Hackett et al. 2007; Lane and Archibald 2008; Rogers et al. 2007). The second group, the Hacrobia, includes cryptomonads and haptophytes and lesser-known relatives such as the telonemids, centrohelids, and picobiliphytes (Burki et al. 2007; Elias and Archibald 2009; Hackett et al. 2007; Okamoto et al. 2009; Rice and Palmer 2006; Patron et al. 2007). The exact relationship between host cells and plastids belonging to members of the SAR and Hacrobia clades is unclear (Baurain et al. 2010; Harper et al. 2005). Despite these uncertainties, it is clear that all photosynthetic chromalveolates possess secondary or higher-order plastids that are bounded by three or four membranes and ultimately derived from a red alga (Hackett et al. 2004; Janouskovec et al. 2010; Khan et al. 2007; Yoon et al. 2002; 2004; Sanchez-Puerta et al. 2007). How these plastids were acquired is a contentious issue, but most recent models reflect a growing consensus that multiple independent origins and/or serial endosymbiotic events best explain the most recent data (Bodyl 2005; Bodyl and Moszczynski 2006; Sanchez-Puerta and Delwiche 2008).

The understanding of the evolutionary history of chromalveolates has recently been further complicated by the unexpected discovery of so-called ‘green genes’ in chromalveolate genomes. Whole genome sequencing and EST studies have revealed that the genomes of chromalveolate species encode hundreds or thousands of genes apparently derived from within the green algal lineage (Moustafa et al. 2009; Tyler et al. 2006; Woehle et al. 2011). For example, the genomes of the diatoms Phaeodactylum and Thalassiosira reportedly contain thousands of genes whose phylogenetic affinities lie within green algae (Armbrust et al. 2004; Bowler et al. 2008; Chan et al. 2011; Moustafa et al. 2009). Putative ‘green genes’ (albeit fewer in number) have also been detected in the genomes of other chromalveolates examined (Cock et al. 2010). The presence of ‘green genes’ has led some authorities to speculate that the last common ancestor of the chromalveolates once harbored a green algal symbiont that was later replaced by a red algal symbiont that gave rise to the chlorophyll a+c-containing red-type plastids that characterize most extant chromalveolates (Armbrust 2009; Dorrell & Smith 2011; Frommolt et al. 2008; Moustafa et al. 2009). In short, the green genes found in chromalveolate genomes are hypothesized to have been obtained via endosymbiotic gene transfer (EGT) (Huang et al. 2004; Reyes-Prieto et al. 2008; Slamovits and Keeling 2008; Tyler et al. 2006).

Other studies suggest, implicitly or explicitly, that the green phylogenetic signal in chromalveolate (particularly diatom) genomes may be more apparent than real. Biases associated with the heuristic phylogenomics pipelines needed to construct genome-level trees, and with the uneven distribution of protein sequences among eukaryotic taxa, have been described previously (Stiller et al. 2009; Woehle et al. 2011). In this study, two chromalveolates, the nonphotosynthetic stramenopiles Achlya and Thraustotheca, were canvassed for proteins of putative green algal origin. These proteins were annotated and combined with homologs from other oomycete genomes or expressed sequence tag (EST) databases and with homologs representing all other available eukaryotic taxa. The phylogenetic trees obtained were used (1) to determine whether nonphotosynthetic, aplastidic oomycetes encode green algal genes similar to those found in diatoms and other chromalveolates (note that if oomycetes are ancestrally nonphotosynthetic, then their genomes should not encode ‘green genes’), and (2) to critically reassess the veracity of green genes found in chromalveolates in toto.

METHODS

The genomes of Achlya hypogyna (ATCC 48635) and Thraustotheca clavata (ATCC 34112) were sequenced and assembled, yielding 17,430 and 12,154 predicted proteins, respectively. Green genes possibly obtained by HGT or EGT events were identified using evolutionary gene network (EGN) analyses as described in Bittner et al. (2010). In brief, all sequences were compared against one another using BLASTp. Two sequences were connected in the EGN connected components graph when their BLASTp E-value fell below a user-defined threshold and their sequence identity (BLAST identity percentage) equalled or exceeded a user-defined limit. For example, an EGN with user-defined parameters of ‘1E-20 at 80% similarity’ connects sequences that have BLASTp E-values below 1E-20 and sequence identities equal to or greater than 80%.
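The connection rule described above can be illustrated with a minimal Python sketch. It is not the EGN software of Bittner et al. (2010); the function name `build_egn`, the tuple layout of `hits`, and the toy sequence labels are all assumptions made for illustration. The sketch links two sequences when their E-value is below the threshold and their percent identity meets or exceeds the limit, then reports the connected components.

```python
from collections import defaultdict


def build_egn(hits, max_evalue=1e-20, min_identity=80.0):
    """Group sequences into connected components of a similarity network.

    hits: iterable of (query_id, subject_id, evalue, percent_identity)
    tuples, for example parsed from all-against-all BLASTp output.
    (This layout is a hypothetical choice for the sketch.)
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for query, subject, evalue, identity in hits:
        if query == subject:
            continue  # ignore self-hits
        # Edge rule: E-value strictly below threshold, identity at or above limit.
        if evalue < max_evalue and identity >= min_identity:
            union(query, subject)

    components = defaultdict(set)
    for node in parent:
        components[find(node)].add(node)
    # Sorted output keeps the result deterministic.
    return sorted((sorted(members) for members in components.values()),
                  key=lambda members: members[0])


# Toy data with invented labels (not results from this study).
hits = [
    ("oomycete_A", "green_B", 1e-50, 85.0),
    ("green_B", "green_C", 1e-30, 82.0),
    ("oomycete_A", "fungus_D", 1e-25, 60.0),  # fails the 80% identity limit
    ("animal_E", "animal_F", 1e-10, 95.0),    # fails the E-value threshold
    ("plant_G", "plant_H", 1e-40, 90.0),
]
print(build_egn(hits))
```

Lowering `min_identity` in this sketch merges more sequences into each component, which is why networks built at different identity thresholds can group the same protein with different sets of taxa.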


In this study, batches of networks were constructed separately with minimum protein identity thresholds of 35, 45, and 65% and an E-value threshold of 1E-20. Networks including oomycete proteins and one or more protein sequences derived from representatives of (1) the green algal lineage (GAL) or (2) Fungi were selected for further investigation. Annotations for candidate HGT/EGT proteins in the Achlya and Thraustotheca genomes were then refined using NCBI’s conserved domain (CDD) and KOG databases (Marchler-Bauer et al. 2007; Tatusov et al. 2003) and then used to drive BLASTp searches aimed at recovering more distantly related eukaryotic homologs from GenBank. Homologous sequences from representatives of all available eukaryotic lineages were selected, aligned using “Geneious Alignment” with default settings in Geneious v5.5 (Drummond et al. 2011), and manually edited as necessary. Thus, each protein alignment included all sequences in the EGN of interest, as well as a number of more distant homologs from other eukaryotes. Maximum likelihood trees for each protein alignment were constructed using PHYML (Guindon et al. 2010) with the WAG substitution model (Whelan & Goldman 2001) to account for heterotachy and 500 bootstrap replicates. Bayesian posterior probabilities were calculated using the MrBayes plugin for Geneious, run with default settings and the WAG substitution model.
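The network-selection step described above (retaining networks that contain oomycete proteins together with at least one GAL or fungal protein) might be sketched as follows. The function `select_candidate_networks`, the `lineage_of` lookup table, and the sequence labels are hypothetical and serve only to illustrate the filtering logic; they do not reproduce the scripts used in this study.

```python
def select_candidate_networks(components, lineage_of,
                              focal="oomycete", donors=("GAL", "Fungi")):
    """Keep components containing the focal lineage plus a donor lineage.

    components: list of lists of sequence identifiers.
    lineage_of: dict mapping each identifier to a lineage label
                (an assumed annotation table for this sketch).
    Returns a list of (sorted members, sorted donor lineages present).
    """
    selected = []
    for members in components:
        lineages = {lineage_of[seq] for seq in members}
        donors_present = sorted(lineages & set(donors))
        if focal in lineages and donors_present:
            selected.append((sorted(members), donors_present))
    return selected


# Toy components and lineage labels (invented for illustration).
components = [
    ["Achlya_1", "Chlamy_1", "Volvox_1"],  # oomycete + GAL: kept
    ["Achlya_2", "Thraust_2"],             # oomycetes only: dropped
    ["Phyto_3", "Asperg_3"],               # oomycete + Fungi: kept
    ["Chlamy_4", "Asperg_4"],              # no oomycete: dropped
]
lineage_of = {
    "Achlya_1": "oomycete", "Achlya_2": "oomycete",
    "Thraust_2": "oomycete", "Phyto_3": "oomycete",
    "Chlamy_1": "GAL", "Volvox_1": "GAL", "Chlamy_4": "GAL",
    "Asperg_3": "Fungi", "Asperg_4": "Fungi",
}
for members, donors_present in select_candidate_networks(components, lineage_of):
    print(members, donors_present)
```

In practice the same filter would be applied to each batch of networks (35, 45, and 65% identity), and the surviving components would supply the sequences for annotation and alignment.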

RESULTS AND DISCUSSION

Because they are ancestrally aplastidic, oomycetes are a perfect foil for examining the hypothesis that chromalveolate genomes harbor varying numbers of green genes acquired via EGT from an ancient green algal endosymbiont (Dorrell & Smith 2011; Moustafa et al. 2009). Genes of cyanobacterial and/or red algal origin were originally reported for the genomes of Phytophthora ramorum and P. sojae, but it has since been demonstrated that these genes are very unlikely to reflect cyanobacterial or red algal contributions to these genomes (Tyler et al. 2006; Stiller et al. 2009; Woehle et al. 2011).

In this study, 12 protein-coding genes from the Achlya, Thraustotheca, or other oomycete genomes were examined that, based on EGN analyses, are closely related to genes found in green algae (Table 1). Three exemplary EGNs are depicted in Figure 1. These networks indicate that Phytophthora spp. share one or more copies of the pyruvate phosphate dikinase (PPDK) gene with the green algae Chlamydomonas and Volvox (Fig. 1a). The PPDK gene is, however, absent from the genomes of Achlya and Thraustotheca, and this observation, coupled with the current understanding of oomycete systematics, implies that PPDK was likely acquired in the Phytophthora lineage following the pythialean/saprolegnialean divergence (Beakes & Sekimoto 2008; Sekimoto et al. 2009). In any event, the PPDK network clearly demonstrates a putative green algal gene in Phytophthora spp. that is unknown in other oomycetes. If the Phytophthora PPDK genes were acquired via EGT, then this observation is most parsimoniously interpreted as a recent event, not one that can be associated with the presence of an ancient green algal symbiont. All oomycetes examined encode single copies of eukaryotic translation initiation factor 5B and an aldehyde dehydrogenase whose most similar homologs are putatively found in the bryophyte Physcomitrella patens (Fig. 1b, 1c, respectively).


Maximum likelihood (ML) trees for six of the 12 oomycete proteins of putative green algal origin examined in this study are depicted in Figures 2 to 7. These six were selected for demonstration because they are the most taxon-rich and best supported; trees for the remaining six proteins are equally problematic, or worse (see below).

A tree of DEXDc homologs is presented in Fig. 2. The EGN for DEXDc implies a green origin for this gene in oomycetes, specifically uniting oomycete homologs with the sequence for Chlamydomonas reinhardtii (not shown). Note, however, that in the tree the C. reinhardtii DEXDc terminates a very long branch and that when other eukaryotic homologs are added the oomycete/green relationship becomes less clear. In fact, this tree implies that oomycetes share a common ancestor with the Opisthokonts (fungi and animals), a result clearly at odds with the current understanding of eukaryotic systematics. In summary, (at least) two phylogenetic errors are apparent in the DEXDc tree: long branch attraction and a topological error that can likely be traced to problems associated with taxon sampling, i.e., clear homologs to the algal, plant, oomycete, and fungal DEXDc genes have yet to be identified in other eukaryotes. The same issue, taxon sampling, and specifically the differential distribution of homologs among eukaryotic lineages, also plagues the RPB tree (Fig. 3). Bearing in mind that protein sequences for animals, fungi, and plants far outnumber those available for other organisms, the RPB subunit II tree implies that the alveolates are sister to a clade including stramenopiles (brown algae, diatoms, and oomycetes), animals, and green algae + land plants (Fig. 3). This topological error is likely compounded by the observation that the alveolate sequences terminate long branches whereas the embryophytes terminate shorter branches, and heterotachy is a well-known source of phylogenetic error (Kolaczkowski and Thornton 2008; Pagel and Meade 2008; Philippe et al. 2008; Shalchian-Tabrizi et al. 2006). The ALDH tree implies that the stramenopiles are not monophyletic; green algal sequences are nested within a clade including sequences for alveolates and stramenopiles (Fig. 4).
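Statements such as "the stramenopiles are not monophyletic" can be checked mechanically on a rooted tree. The following is a minimal sketch, assuming the tree is stored as nested Python tuples with leaf names as strings; the toy topology and the function names are invented for illustration and do not reproduce any tree in Figures 2 to 7. Because monophyly depends on where the root is placed, the result is only as reliable as the rooting.

```python
def leaves(tree):
    """Return the set of leaf names under a node (a string or a tuple of subtrees)."""
    if isinstance(tree, str):
        return {tree}
    names = set()
    for child in tree:
        names |= leaves(child)
    return names


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the taxa in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Toy rooted topology: a green algal sequence nested among stramenopiles.
toy_tree = (("oomycete", ("green_alga", "diatom")), ("animal", "fungus"))

print(is_monophyletic(toy_tree, {"oomycete", "diatom"}))  # stramenopiles
print(is_monophyletic(toy_tree, {"animal", "fungus"}))    # opisthokonts
```

In this toy case the stramenopile check fails because the green algal leaf sits inside the smallest clade that contains both stramenopile leaves, which is the same kind of pattern described for the ALDH tree.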

These same types of phylogenetic errors are demonstrated in Figures 5 to 7 and are not described again in detail. What these trees clearly demonstrate, however, is the pervasive influence that the vast number of sequences available for fungi (80+ complete genomes) may have on phylogenomics studies (cf. Stiller et al. 2009). The TOR-containing kinase tree suggests that green algae may not be monophyletic and that green algae and stramenopiles are, again, sister to animals and fungi (Opisthokonts) (Fig. 5). The unorthodox relationships among green algae, oomycetes, and fungi are also recovered in the YAK1 tree (Fig. 6). The ALS tree is equally vexing and seems to suggest that the chromalveolates (in toto?) may have obtained their copy of this gene via horizontal gene transfer from fungi (Fig. 7).

Green Genes in Oomycetes and Other Chromalveolates?

On the basis of the data collected, the notion that chromalveolate genomes encode hundreds or thousands of genes derived from green algae is false.

Critical analyses of protein-encoding sequences of putative green algal origin from oomycetes and other chromalveolates yielded trees seriously compromised by a number of obvious and well-known sources of phylogenetic error. These included, at minimum, biased taxon sampling, long branch attraction, and heterotachy. This argument is bolstered by the curious fact that so-called ‘green genes’ can be detected in oomycetes even though these organisms are ancestrally aplastidic. These results, and those of Stiller et al. (2009), suggest that these biases are currently so prevalent that broad-scale evolutionary scenarios drawn from phylogenomics studies need to be interpreted with a higher level of skepticism.


Table 1. List of 12 annotated proteins from the Achlya and/or Thraustotheca proteomes or other oomycetes found in EGN connected components graphs clustering with homologs from green algae.

Protein                               Annotation
TOR-phosphatidylinositol kinase       phosphatidylinositol kinase, putative target of rapamycin (TOR)
Yak1                                  PKc-like superfamily, Yak1-like protein kinase
acetolactate synthase (ALS or AHAS)   TPP_AHAS [cd02015], Thiamine pyrophosphate (TPP) family, Acetohydroxyacid synthase (AHAS) subfamily
DEXDc                                 DEXDc superfamily, pre-mRNA-splicing factor ATP-dependent RNA helicase PRP16, putative
RPB                                   RNA polymerase beta subunit, cd00653: RNA_pol_B_RPB2
RRM                                   RRM superfamily, PREDICTED: cleavage stimulation factor subunit 2-like
RRM2                                  RRM superfamily, PREDICTED: similar to RNA binding motif protein
Sm_D1                                 Sm-like superfamily, small nuclear ribonucleoprotein D1
Sm_E                                  Sm-like superfamily, small nuclear ribonucleoprotein E
thioredoxin peroxidase                thioredoxin-like superfamily, cd03015: PRX_Typ2cys
threonine protease                    threonine protease family T01A putative, cd01911: proteasome_alpha
ALDH                                  ALDH-SF superfamily, cd07084: ALDH_KGSADH-like


Fig. 1. Three examples of putative green genes in oomycete genomes based on EGN analysis conducted at 65% protein identity. (A) All species of Phytophthora in this analysis share a copy of pyruvate phosphate dikinase (PPDK; P. infestans gene ID 03724) with Chlamydomonas reinhardtii and Volvox carteri, two microscopic green algae. Note that PPDK is not encoded in the Achlya or Thraustotheca genomes. (B) The moss Physcomitrella patens shares both eukaryotic translation initiation factor 5B (P. infestans gene ID 20386) and (C) an aldehyde dehydrogenase (P. infestans gene ID 00034) with all oomycetes included in this analysis.


Fig. 2. DEXDc ML tree: Oomycetes are shown as sister to animals, sharing a common ancestor with fungi. The phylogenetic errors demonstrated include long branch attraction and topological error due to sampling bias.


Fig. 3. RPB ML tree: Alveolates are shown as sister to a clade including stramenopiles, animals, and GAL. Long branches in the alveolate clade and short branches in the GAL, stramenopile, and animal clade are indicative of topological error due to heterotachy.


Fig. 4. ALDH ML tree: Stramenopiles and GAL are shown as not monophyletic. Long branch attraction between GAL, stramenopiles, and alveolates is likely responsible for the phylogenetic error.


Fig. 5. TOR-containing kinase ML tree: Stramenopiles are sister to GAL and are shown sharing a common ancestor with animals. Heterotachy and topological error due to sampling bias are demonstrated.


Fig. 6. YAK1 ML tree: GAL and stramenopiles are shown sharing a common ancestor with fungi. Long branch attraction between GAL and stramenopiles, heterotachy, and topological error due to sampling bias are demonstrated.


Fig. 7. ALS ML tree: The tree consists of two clades, making inferences about the relationship between the two impossible. Phylogenetic error is likely due to the abundance of available fungal genome data (sampling bias).


SUPPLEMENTAL INFORMATION

Table S1. Selected hypothetical proteins (n=16) from the Thraustotheca clavata 2 genome possessing putative 5’ transit peptides. Chloroplast transit peptides predicted 3 using ChloroP (v.1.1) are shown in bold face. Transit peptide sequences predicted 4 using SignalP (v.4.0) are underlined. 5 6 >THRCLA_02069 7 MVRISALLGTFALIHAQTTTAPPASASNSWTMTTVNSIQARVVSDAATWDATNKKFG8 LVMKQNTVTFPDQYRAAMDTVNTASVEGALFYVQTEGINKQFDVNCMRKTNMSYIWF9 LNVTIVQPTFAIAEYADNGGVVPEYGKFIAMDNGQCTPLDTKGTMSDECMTLGGLNYH10 ANIGPFIGGEPRKEHLLAKYPDNIWFSYPNSCFTKTFIAKDTKCREAQKGGLCPLGVQP11 DGIKCTYSFDILGYIRIDELVGITNLTNSQTGQKYKDRVEFCKDSKVEFDFSTMKSDLTF12 WDNPTDEAANTNRTTKMLELYNNLIKTGTGDAAYMKSLPTAAELTAKNPPCWKNSPIC13 ATAEFGCRRKLTAQICEKCTSASPDCKKPTSSDSVPPKLTKAVAPPLPTDASGKTTVP14 RNPTGAGGNGNAAAAESSASSLVAFTSLIITLAALFA 15 >THRCLA_03737 16 MKSTFVLLAAISLVNASSSTKLRGAAPCPNSNSGSSDNSSDYSGSESNWDSGSGSD17 WDDCGSGSTSTSDSGSNDYPSNWDSNSGSDTTEEPATYAPAPTSAPTSAPTETPAT18 SKGTLKEQIIHQTNLIRAAHGLGPVKWNDELAAKMQAWANSDPQQNGGGHGGPPGN19 QNLASFDVCNDNCMRMTGPAWAWYSGEEKLWDYDANKSRDGIWETTGHFSNSMDP20 GVNEIACGYSTFYNPQIGHDDSLVWCNYLGGNNGVIPRPRIDQATLEKQLTSAY 21 >THRCLA_03876 22 MNLKAWILSVAIASAAAASGSSSGSGSTTDAPLTQENLSSRPGLCNTSKDCAKYTKG23 SNVYSCIAVKSNIVNLTTLKQCVLGDGCSGGKAGSCPTFTSWPQKFRQVQPVCAFVA24 VPNCNSAVNSQGQVVSVRSLREQAAKPGNVTCFQAKFGSNSSSSDDSATVYGIYQCV25 DKKLYAEKNLGYLDNTPKQLQSCAGNVTVVNGQSVSNVLCNGHGTCVPQTDFSDIYK26 CLCSTGYSDKDNCGAATGNVCSAFGQCGNGNCNPDTGKCVCPYGSTGDQCSKCDP27 AQNNNASVTNMCNGNGKCGIDGTCQCSDGYLGTNCETQIKKNSTASSATGSTTSSKK28 SAASGLHEASIAIFSIATIFAAALI 29 >THRCLA_04285 30 MQIKSIIATLTLAALAQADNNNCEKSCTKELSPLCASNNETYNNLCLFQIAQCQQPTLTI31 SANQSCSTNVKFCTRLCPTVYQPVCGSDNTTYPTECDLKNKACNNPSLTVTKQGACD32 NCPKACLEILAPVCGSDGKTYDNTCFLLKTACANPSLNLTFVSTGSCTNGNNTTTTAPP33 SGTTLPPSGTTLPPTTSGNPSTTTTPPTTKPASSATTAMLSLMSAAAIAITYML 34 >THRCLA_04386 35 MKWQVALLSLVTSGIAQDHCGSTTVPTIVPTPAPTLAPTPAPTPAPTPAPTPAPTPAPT36 PAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPLAVATATWTNLW37 SDIVQVATDNTQICIRETNGDVDCKPWSTDSSLPTVYGGHSSNFLATGGGWSISTVNN38 
VNYLVVISPLYNANVMVLDEAILYAATDGATCCITTSTFRCASQKLDMTFVKMTDKYITS39 SSIYNAVIYGVDAQGKLYKGSTASISTGVANWQEVSTPCPFTQVSYDGTTLCGLYAST40 NTIVCTSGTLSLQPNWVALQSNKWKQFSITQSYIYAVDTSNNVQRLQISQPIAVAP 41 >THRCLA_04952 42 MTLASSPTFSRPLLLPPLTSALSPSIAQQMKRQHECEGGGSVKRHCSTFPYMEMPRL43 PSITQPSSHIGYLSESYYPSPTSLPMLPPASTLLQQATRKSMDLVPSNAYAPTLPEPCT44 LYKSNENTKPSPSNEEVRGECLDAQCHNSVKHRGYCKLHGGARRCDVPGCPKGVQG45


GNLCIGHGGGKRCRFPGCSKATQSQGLCKAHGGGVRCKYDGCNKSSQGGGFCRRH1 GGGKRCSVAGCPRGAQRGTTCAQHGGKAQCMIDGCVRADRGGGYCEVHRKDKVC2 RQGYCNRLARIKCEGYCTQHHREFCITSPPQ 3 >THRCLA_05863 4 MGSVLVLLFSPLHAWLTSSSSSSLPCLQTSFLLSQNASLDQIATAQPRINDAIVSFLAQ5 SNVQSRIQWTNVDITTSTIDGNGPAIVQMCLYVPPNTSVNQVAMAMSATVNWSGLKTS6 ISTLHRRFQLFDLTTPLQVLNQVTQYNFQRIPFPYVQWYLVVQGRFDYFWPQIKIKHAIA7 VLLNISSSSVIPQDIIFPPYDAYNDIATILPFAITQVNSSTFARTANTLTGPLQDILALHGILL8 LTQFPDPNGNGKLQQSVPWPEYPQLDPTSFYPFHNWTPVPNSFVVKLIYGGLLTLTN9 MSSVILQVLDVLDSPQTANFTDFQTLTLTYPPNNGTATFESSRYNTLDFIVAGDRSTLE10 ANQQTLGESLYQIGVSIFDVIDINSTMQTAQWYPYMQLDCPYNLSALASIIQRIALAAFF11 SIPLSSIQLIEIATNSTTFEIACNDTLEQRYLKKQLKETTRWSTVMNNFTANSAFCTIGGE12 SLAYPPMFPGSTYGWSQPSSSMDNTCSVNTIELTACDQCDRYLNAVCFTNPNCYQTQ13 TTLLSQLLVSSNASSVFQQLSLSTSANTKTLNTLALYYSCIAAFQCLIAPNTSIITSDEVYT14 IDINANGANFSTTLYYPQDDIYLVLNDQTTLEEIQINLSSSISNSIFVNVSGTSSSFNVTM15 DSVVIPFQLPVIAYSTVPATIQRISASIPQLVFLSNSSNDTTVLLNGKCTTCLTQMDECK16 MSPSCPSIAICWSNVVESAISQLDSVYSTLEISTQLISCYENASLEDFEMFLRVQKCLLQ17 SSCPISPTLESIVKGTMIVLRSTTGFQTIELTPTPAVTLTIGTESIILSSNSISGLQATMINFL18 SPLCQASIQSNTANLTIQFNDFGAPILPTINGTIYSQMPRIFLDRMPLDSSRFGFSYQSY19 KQLSPSSLPNAFTTTLNSNCQMCQNLFDQCLLSSFCASIISNFQNTIAGATNAFIGWSV20 ALQRLSFDIPEWDQFAQTLSCFEIHNCPINSTISMLKNGRMLLLSSTPVVLSVTFSSSPF21 EAAIYVQRFRQPINVSSNSSAAYIQGQFQMNFGSLALTNVSITNTSMELSLNSYYGPTP22 EFMVTSSEFSNKTIILGTSMVSVVSYSPAAYFPY 23 >THRCLA_06099 24 MKFALVSSLAVLASAQTNNSSAGSNSNVNCPLQFTSACANTQECGTLNGYPLECQV25 YGSVKQCVCSKENANCQNSTNIANTIPQFGVCTGGKQCAGSGFKALQTPVRTCSEQL26 VCIPQYASGNELQSICHTCSSCKQQNKPDATGRLIFNCTQICPLGQGDPIVTIPPVTTAP27 TNSTKKNDSSKGSGSTAGSKPKSAATSIVAGVATVAIVAIASLF 28 >THRCLA_07047 29 MILINLLFGLRLCTDGVSLLQQQVPRKPSKRTKQSRCKHVPFVASTALKPTHETLAPL30 MPLVVYQEVTENDMAHLISLVDNQDNQEDNEEITENVVADVFVPLVDNQVSQEANEN31 VVVDFADNQDNQEANENVAEFEPLVNYQDNQENVAEFVSLVDNQGSQEASENVVVE32 LVPSVVCRDSFEPTEEDVAAVLHGRFAANQAALLRVSSFQPADDRSLTAIQLIRYFELY33 HLVRMDYNQLRHLEPSRLEKIQLVRLSILERQAIEAMLSDVAELWSRQPNDVSSAKKLQ34 WFKNLQYGLMWDMLELLEHQKPDHHCARGLCPQLYQEKLDIIYSE 35 >THRCLA_08011 36 
MKTIFLTTALLASTSCALQMTNKERNEILDELNKWKQSAVGKAALVHNFLPSSQRQEG37 LSIDAKQDLEITRFAHTKKVVEQLNKEHKGSAVFSTNNMFALMSDEEYKKWVKGAFGR38 DHKKRQLRGENIQLELTAEQREASGIDWTSNKCMPAVKNQGQCGSCWTFASVGAAE39 MAHCLVTGNLLDLAEQQLVDCASDAGQGCQGGWPTKALQYITQTGMCTSRDYPYTA40 SDGQCNNSCKKTKLSIGEPVDIQGESALQSALNKQPISVVVEAGNDVWRNYQSGIVQQ41 CPGAQSDHAVIAVGYGSDGGDYFKIRNSWGAEWGEQGYIRLRRGVGGKGMCNVAE42 GPSYPSMSGKPNPDGPTDEPSNDPTDEPSNDPTDEPSNDPTDEPSNDPTDDPSDDP43 TDDPWNGSNDWDWGN 44 >THRCLA_10855 45


MIVPVVLMALTGISITGVTLRWCCSHRQKTSWKEERKEPLLATPVPLPRIKTDVFIERSI1 AMDVPLMETCSGCGAWIDPSLAAIANGLCVVCSYQTIPSLEIDIDENISENDETSNDKDS2 DKPILTDEDTTADIESPKDVEIQIEESNFEDEMEDISTTSQDMVIPISQDNNCNEGDEDV3 EIAEEALALVQDMWDIAYQAHLGVGDDPTADIFVEMALDLDATAEAIKKEPHLLSESFH4 FLSLSLASLMELVPEAWVAHVEATELKFEALQFRYHSKLTVENCLDLATHLYELVECAQ5 EFGVDPAVASSLMDGLEELVEAIEETPCELVSWLAYLAATVKLLKSYQRDFEQAEMWD6 TVVECERNLEPLEMHCWEIYSPC 7 >THRCLA_10997 8 MKASLCIATLAAMGSIASSRNIRHHAESVMGNPVQRRSESTRLPTHPLTGYWHDFPN9 PAGDTYPLTQITKDWDVIVVAFANSLGSGKVGFDVDPKAGSETQFIKDISTLKAAGKTIV10 LSLGGQNGAVTLNDATETANFVSSVYDLIKKFGFDGIDLDLENGISKDLPIINNLITAVKQ11 LKQKVGDSFYLSMAPTYGGIWGAYLPIIDGLRNELTQIHVQYYNNGGFVYTDGRTLNE12 GTVDCLVGGSVMLIEGFQTNYGNGWKFNGLRPDQVSFGVPSGTSAAGRGFVTPEVV13 KRALTCLVQGVGCDTVKPPKTYPTYRGAMTWSINWDSHDGYVFSRPARQALDSLGG14 SPPQPNPTAVNPTDAPNPLTNPPTSRPTNTPTVTPTQSPRPTSQPTSLPTSSPSSVPTI15 NPTPIPTSVAPQPTQAPSSSC 16 >THRCLA_11248 17 MANTIQWLFIYCVIVASQGPPNNGERTCSVTLGGPVSQTSTAGTMSFCTAFPQERCC18 LPVHDEYVKSTFYALLDSGYICASATNTAIAHLQTMFCLACDPSMSLYLTPPRNTTFFS19 APQTLKVCRALAISFKQHIDAVSPYYFSDCGLTYAGDRNNLCIPKTAISPNMVFPGCSE20 GQNICYSTTQGYYSPIWYCSSSPCGPDTPFGLNDIPCSGPTCTPAFQFLNDNRAAKPP21 FFEPFAVEIIDESTCAPGESSCCMTDSSIVPTS 22 >THRCLA_11271 23 MKTTAFVLALASTAAASSPCTGSAVITAVTPLIAQATTCSTDSGFDLVALISGTTPTDA24 QKQKFLTAESCKTLYASVQKSLAGITPACTIGDIDTSGWSTVSMDKGLDALIKSLPSLLA25 SSGATNSTSNSTANSTISSTTVSPSSTTAAPAKSGVAATGVTIAAVALTTAILHLNANKQ26 QEIHEHLRLTIKESDVETLGEVMSMSLIPAAEAHQFI 27 >THRCLA_11391 28 MKLSILLAAFGVVASSSIPKHTYKCNDGVCVQTPLNGAGVSLGSPLLSLRMCEMTCG29 AGSLWPYPASVSLGTTATAIDTNKVSHSIKINGAEATSTLTNSIVQTFNEGVKAKTKWV30 RGQSEIGAISHSIYGTISSNNEVLGQDTDESYELSIDGPRVKINAATIYGYRHALTTLNQL31 IDYDELTNSVKMISKATISDKPAYSHRGIVLDTSRNFYPIESLKRMIDTMGANKLNTFHW32 HMTDSSSFPIEINGEPRLTTYGAYSAEQIYTQDQIRDLVQFAKARGVRIIPELDAPAHAG33 AGWQWGPKAGYGDLTLCYGADPWMNYCLEPPCGQLNPLNKQVYSVLDTVYKELTSL34 FDGDVFHMGGDEVSIPCWNSSKVITDHLKDTNKPGAFFDLWGDFQTKAAAMLNKKVM35 VWSSDLTTDPYLKYFEPNNTIIQLWGGSTDGDATRITSQGYDVVASYWDAYYLDCGFG36 GWVSKGNGWCAPYKSWQVIYDLDITANMTAANAKHVLGSEVAMWSEIADAHVVETKV37 
WPRAAALAERLWTNPKTDWKSAMGRMRIQRDRIADAGIGADAVHPLWCRQNPGKCQ38 LV 39 >THRCLA_11516 40 YTCVAVQTAIAGIALASQCVLGTTCGGNSAGQCPTFSSWSSSYQKIQPVCAFVNVTN41 CVNFIKAGSEAKATSGSGSTSTVNCYQATFSANNISQVVSGIYKCVDSGLYVSQNLGAI42 KNLTTTQMDVCAGNLTTSVGALCNGHGTCAPTAAFSSKYQCICNEGYSATDNCNVAT43 SNVCNAFGSCGAGNTCDTTSKQCSCTTGTTGPQCSLCDPTASSSVVCNGNGVCSSS44 GTCTCNSDYTGSLCSRTATTNSTGSNKSSSSSHLVASLATIATCLLAILM 45


1 2 Table S2. Selected hypothetical proteins (n=30) from the Achlya hypogyna genome 3 possessing putative 5’ transit peptides. Chloroplast transit peptides predicted using 4 ChloroP (v.1.1) are shown in bold face. Transit peptide sequences predicted using 5 SignalP (v.4.0) are underlined. 6 7 >ACHHYP_00269 8 MVRTLSLLLLAAGVAGQTSTTPVPTPAVSNPPFTMTLVNSIQARVVAEAATWDETNQ9 KFGLVLKQNTNTFEERYRAVMDTVNTASVEGALYYVQTEGIDKPLQTGCMRKTNMSYI10 WFLNITMVQPTFAIAEYQDNGGVVPEYGKFVAMDGGLCTPVGTETPLECLTYGGLNFN11 KNLGQWVGGEARKKNGRANYDDNYWFSFPNSCYTMRFDAKTKACRDLQKGGLCPIG12 TQPDGVKCTYSFDVLGYLAIDDLVGITSMKNTLTGQNFKGFSEFCKAGKTEYNFADSS13 SDLTFWNDPLEPAANANRTKVMMQKYNDLVQNGVGDQKHMKALPSVEELTKANPPC14 WKNSPRCATAANGCRRKLLSQICEVCSAPADDCKKPGPNDKAAPMLNKQFQPALPTD15 ATGNTKQPRAPNAAPLDAPAGGAGGNVIKGSGAAATSLILATAVGLVALAV 16 >ACHHYP_01095 17 MLARLAALIGVAAALQVPFTTYECVRGRCEPRPRSFSPPDSASSLRLCEMTCGAGNL18 WPLPTSVSLGTTTRVVSVDYVSHTVTFLDNSVPISPLVGAIQRIFDNTLALKATECALAS19 VGGAELAVTASIESGNEVRDYFRTFTMAADDNTMVQELELETDESYTLTIVDGAATIHA20 ATVYGYRHALTTLSQLIEYDELSHDMHIISAVTITDAPHFAHRGIVLDTSRQYYSVPAIKR21 LLDGMGATKLNSFHWHFTDTASFPIEIKGEPRLTAFGAYHPRSVYTQQAMRDIVAYAR22 ARGVRVIPEVDAPSHVGAGWQWGKDAGLGELAVCFGHNPWTEACVEPPCGQLNPF23 NPHVYDVLETVYEELNEIFDSDVFHMGGDEVHLGCWNMSAAVTAHMTDRSPDAFYRV24 WGRFQMQARQLVGEKKIAVWTSDLTNAPYLRKYFDPASTIIQMWTLSTGSDAARFTA25 QGYPVIASYYDAYYLDCGFGNWLLKGADWCTPYHHWSVLYDLDVLHNVPAAQRNLVL26 GGEVALWSEEVDEATMDAKIWPRAAAAAERWWSNPVNGTWKDAIDRMRIQRDRLVD27 IGLQADALQPLWCRQNAGDLSQGSGISISATVKSKSEALTVDTDESYELSIDGPKVSIN28 AATVYGYRHALTTLNQLIDYDEISNSVKMIAKAKIADKPAYSHRGIVLDTARNYYSIDSLK29 RLVDTMGANKLNTFHWHFSDSSSFPFEIKSEPRLTSYGAYSKDQVYTQDQIRDFVQFA30 KARGVRIIPELDAPSHAGAGWQWGPKAGYGELTLCYGSDPWMDYCLEPPCGQLNPL31 NDHVYDILKTVFEEMHGLFDSNVFHMGGDEVSVPCWNSSKVITDHLKNTTSNAPFFDL32 WGTFQTKAGALIEKANKKIMVWTSDLTTDPYLKYFKPSNTIVQLWGGSTDGDAERLTS33 KGYEVVASYWDAYYLDCGFGGWVSKGNGWCAPYKSWQVIYDLDVRANLTATNAKRV34 LGSEVAMWSEIADEKAVEAKIWPRAAALAERLWTNPKTNWKSAMTRMRIQRDRIADA35 GVGTDAVHPLWCRQNPGKCTLV 36 >ACHHYP_01226 37 MTALADAVWLAVMAFLDGQDLSRLMRVSRAHWRRLQAQVRRWREIQLGLGLGHWV38 
QRNVRLTINTQVQEAQSLAVQRSPDARVPPRVETIQKELGPIEAERSVHRLTATTPLFT39 ATQQAVLVLSFDCTSADTKPLLVHTSQRARTLYTTLTLTIFDRTLRRHVYHKASGDLAT40 VPVAEKQAWTNAGATLRCDVASNDKSCQVQLGLPARLDGKIDCYHIERVDFTLHKREL41 YPVFSLPLEPSLPTCWIHLQFHDLARAQCLARVSAPCHALLEMAASRTDDTNHPARRT42 AVEQLEVATFRSTQPTSLPDISSLAKPGMISMVISGPERHQAFYHTAFGHSGATRKSDS43 AHVLAATWVPGVLEFAMYPDTLNRRVLKGIFTLEFAVSGALTSLVVLAQHLSPRRLLRY44 NARVASYSRRPEAERNEDA 45 >ACHHYP_01546 46


MVALFLGTAIALALASSATGSFTGLAMPAANSSEPKSGQCKLMKLLPRATQFNVALS1 PRHYGRGGHCGRCVQTQCDRCAASAPIIAQVTDRASDVGLSKPMLRALFGSGAPSAV2 TWDFVDCPVNDPIALCTKPRNTSAYIIYVQPTNTVAGVQNMTIDGFRGRLTNASYHFKA3 PMPANWSNVRVSMKSFTGDAIAASVALRPGRCVTIPHQFSPSPAAASGTPAVIDYDGD4 EDADSITVPPPYK 5 >ACHHYP_02169 6 MAWIVVLGILAHVATALQSSLCATSSAFSPPGCHANRRLATWSRAIVRLNAGGHVCT7 GWFVGSEGHILTAHHCIHKARAVEVVVEETPAQTCPPRTIRGRMTTGIDVVAFSVALDY8 ALLRPLNRSVRGPVHLQLHSSAADIVGLEAIVAQHVDASSPVVLSEAGRIVSTTFAGCG9 RRDRLAYALDTKASASGSPILSTATGAVLGLHTCGGIHCHGKSVPMWIVIGCSSEPGH10 WNSGAVAADVVADLRQRHHLPPDAVAHETLSAPTPSTIIVERGRLVQRAANTTSVDAY11 LLTMAMPGRVTLDLLAWTMDAQGRWHDLRRDCDGSFFDTKVILAVVDDADGRPLLRR12 IAENDNDTRHQGMGDGSIDNRDAFLDVYLASPGDYYVLVGTAAMLLPAVFAPRLSAPT13 DGGQHLYGCGNTRATEANYNLRITTDDGTLQRIEAPFPRTAACSSSARKCPAAHADTA14 LTLDAVVAGTLHRTYSSGTSMDHISFELTKAGRIAIDVVSYQEHTNGSIAIDGLHDVCGR15 AYLDTVLYVFGATIPSGEYLDPAALVATASDRPPTHVASQRYRSVSTRDPYVEVDLPA16 GNFTLVVGQQPLSLFEAVRVLYPGSRETDAPLLCGRPHPFGHYHVFFWVQHRRMLSA17 TMPGSFDHAACTHEVCSDSML 18 >ACHHYP_02305 19 MKFTTLLVATVFGQNTTTAPSSAPTPAPTKCLLQFTSPCKSSSECGDLNGFNLTCIKS20 GSNKQCNFNGGSTVAKDNQFKAADNLVYQFGDCSTASCTTGHGFTEGLPTTVTCQE21 PLVCVKEINDNPGVVLKSQCHTCGSCKAQSLKDTRFDCSKVCPLTPAPTTKAPKVPGA22 TGSAASSGSGSETSAPATRAPKTGTPAPTAASSASTALVSGIAVVALAFAQLC 23 >ACHHYP_03044 24 MAGLIVGILAAVGTFSGSGESISTGTSSTPAPTTHTPTTLSPSPTTKPTTVTPTPTLAN25 GLCPLRGMYLSGTSCVACPTPKKTFSVFWESQVDCSTFATSSAAAYVTHIYWSFALID26 PTTGTVSSTFQGSSATLKACIAAARAKCIKNYVSIGGATMRQTFVALNSSAQLTTFALS27 AAQVVQEYGFDGVDIDDESGNLLAGGDWKANALPNVLVYLQGLKTQLAALPRAATEP28 KYQITWDEFPTSLSTGCDLASGDYLRCFDVRIANIVDQVNIMMYNSASSTDYDNFLNVV29 TPTEWATAMPASKIVIGGCVGPIGTIGGCAFGAAPTATQLKAYASLLDPALHERLSRMD30 LGFMLDLARDELLVLLESEQAHNPGVAVREGEGREDKQQQRRVQREVGAEEVDEAH31 VGEERVEGGVRRDLAGVEQQ 32 >ACHHYP_03052 33 MAAVSNPLLPLQLALADLLERPIHAALDDALRQPSNEQHLHHCVRSLPPSATVDALD34 ASLAFVVHARALLTICSDYLDQHIAPQHALKKITDLLSVSREIANDAEVNATADDADVDE35 AATDDSDQFASPKGEPPVGPWSGSETPAAPTSRQSWWAQIWGGDEDNDSAGDDVS36 APPEEETLPSLPVEVANTIASLAAFPTNLKLQLHGLEALVEYVHGPCCCESVGPLYAAP37 
DMLPAVLHAISSLAQSKRAQIAGLSLLANPSSPKANMPMLPANLPTQQVRRLILRAMQR38 FKAHAQIQGLGCLALSNLCRGPAISESHALKARGCRLVWSSWLLALICASSGTSMRAH39 PLTGGPEDMQYAVLDAGSVAVVEAASRRFQDDDRVRKHADMALREMLQKHASRRAP40 QCAFQ 41 >ACHHYP_04549 42 MRARAFFVLAGCATAAASPPLPWQSSCQVCAHTGRCGGASSPIKFCGTWPTGACC43 CSANVNCPTPGVHATCDCGFLADYPVDAALPPVADVLGYNFS 44 >ACHHYP_04706 45


MRASVLAIAATVAAAANNQTATTKVFSLEVGTVGVHASRNQDSVLIPCKSNVCVPTG1 SATLEFCRKACNRETGEHDCTTNCACNGTTPGYMCAGICNKAKTADECGSPVFQTCS2 GEDLVPDYECANYKCTNHQRTNYLGANNRCANYERAAYPLPDIHARVRFVKLLNPGE3 AHLIEYYTGLYFGPGQNNANDGFIWNPSVGSIKSISGNSCLDAYVAVDHNVYVHTYPC4 DDSNPNQWWLYDSSLHQLRHKTHSTMCLDADPNDANKKVQMYLCSPGNANQYFDM5 RPILS 6 >ACHHYP_04908 7 MTSVVAVTACLLSWLQRSRASPPVAYSAPNSVAFPAEIVHIKVILSRRRSSVLANGVL8 PPVAPPRRAAHHEGHLAPLTRDLLSDKSGNAPP 9 >ACHHYP_05005 10 MSHCHFAFFVPMLARSLASFTRASRRCFSTEGPFEHRAVSAEVIAELKALYGDRVSTA11 ASVREHHGTDESYHTPSPPDVVVYADSTEEVSKILQIASASKTPVIPFGAGSSLEGHISA12 LHGGISLDLTNMKSVISVEQENMSCRVQCGVTRLQLESELRATGLFFPVDPGADATLG13 GMVATNASGTTTVRYGNMKSNVLGLTAVMADGKIIKTGSKARKSSAGYDLTRLFIGSE14 GTLAVVTEVELRLQGVPEAQKIAVCSFPTIQDAVDTCTVIMQMGIPVARMEFMDHKAIE15 ATNSYSKLNNIVSPCLVIEMNGTPEEIEHHTATVQALAEEYSVQRMSWAATEEDRKELL16 KARHSAWYATMNLVPGSRALSTDVCVPISNLTQVIVDTQADLEASNLVGTIVGHVGDG17 NFHVMLPFLPEDEPAVRAFSDRLVERALAADGTCTGEHGIGSGKIKYLRMEHGDSVDV18 MRTIKQALDPHNILNPSKLF 19 >ACHHYP_05180 20 MYNTADSVAFLSLLTSTVRAITPLPPLQFRVQAKFATGPLPASKPSPSSFISVRFVWNI21 LVRLVVYRRRATPTPVDMAQERTVLA 22 >ACHHYP_05326 23 MHCTFFLSIVTAALAGVAGHVQQRIRSGAVKARGVNLGSWLVTEHFMMPQSPIYQNV24 SADLQPLGEYVVTTALGRAVADPLFKAHRSSWITENDIKEIASFGLNTVRVPVGWWIYE25 DPNDSDWQAYSPGGIQYLDALINDWALKYNVAVLVGMHGAKGSQNGEGHSAPQLPG26 ESHFTDDADNVYTTMQSAKFIMSRYQSSVAFLGLEMLNEPTITPGRVYNIDRTKLIIYYT27 NLYSKLRAICSSCIIMLSPLLNEQYESFGNQWANVLPTGSNNWIDWHKYLIWGFENWS28 MKDIINTGTQWIANDITLWQSRRSAPIFVGEWSLAAAEGILGELKNGTNLNTYANRALA29 AMKEAKAGWTYWSWKVNATDWRSYGWNMQALLRAGVIDLKNA 30 >ACHHYP_05770 31 MSKLSLAFLLHPTALACPPGPEAYVCPLSPETIVCPLSPRVSPASSARAKPKRSPPA32 PRSRPCKEPGCTKYAVTRGHCIAHGGGKRCSVEQCPSGAKSNGLCWKHGGSKTCS33 FPKCSNRSKTYGVCWSHGGGKQCADPNCTKTALRHGFCWAHGGGKRCRTEGCQR34 PAYERNDNLCDVHCAKAS 35 >ACHHYP_06287 36 MQLSHILLFATAAAAQHTLLDSGTPEDRPSSWGSPVTKQIPSAVRFRSSGLCGEAQTI37 DYVDFMVNTDLADIKANATWIGVEICPSVEDVPACPPTSVAEQIPIEVRGKRTTLHWVP38 ATPKVLEPESLYWFIVSSNVENALQAVSWYPGSKRYGTDNDPKSDVASATRMLVPWG39 GMDWVVEPSGGVAPLDHRRVPNAKIVVKA 40 >ACHHYP_06505 41 
MIKSFTITATLLASASSLQMTNKERNELIDELNQWKKSQAGKTALVQGLLPPHPKTESF42 DANAKLEAELVRFATTKKVVEKLNAEHNGSAVFSTDNQFALMTDDEFKKYVQGAFGK43 PHKKRQLRGENIQLELTPAQREASGKDWTTSKCMPAVKNQGSCGSCWSFAAVGASA44 MAHCLVSGKLIDLSEQQLVSCASSAGQGCQGGWPNKALEYIAQTGVCTAADFPYTQS45 NGQCKQSCRKNKLSIGRPVDIRGESALQSALDKQPVTVVVEAGNNVWRNYKSGIVKS46


CPGAQSDHAVIAVGYGNGFFKIRNSWGANWGEQGYMRLQKGSGGNGMCNVAEAPS1 YPSMSGSPKPNNDDNNMPDDNDD 2 >ACHHYP_06977 3 MKISRVAVIGLLFVAARSTRAQSTSSSTQSNTETTSTESTPFSSSSSSGPAPIVDVIAAA4 IDAGATPKQAAIVAVAADTGASLAAIIQTAVDAGVSPSIASAVASAANSAAGSGADDVTS5 APITTVADAAVDAGATTAQAAAIANAASSGVSGDDLVNVAISVGVPASIASSVASAAGS6 AAGTPAPIADVIQAALDSGASLNQAAAIAVAVSAGVSVDDIQTAAIQQGLPASVASSIAS7 AVQSTIASAAGSTSADALAANGLGVTSASSTSYVPPSEVTPLKLTGAKDPEAASDVNS8 PEAYSFSAPMTSGSTKSSESPLSGISGMFNNIVALVTSAPSPAEEPKPRLRASCRTA 9 >ACHHYP_07400 10 MKTPAFLASALFAVATGERPACGPDTPSPTMTPTADPTFAPTSGPTFPPTPAPGQWT11 SLGGFAHDISFDGTNVCVKNGDGAFCGFAGQPFDQWKPVATQLKDIEQVACAKGVAF12 VWGRSSGDLVMKTINLKTGEEHDAKMQDGESPRQFSTDGSVVCGTTNSRLFGAKVT13 NGALGAYSTISEDHEIYKTAVAGEFLIVAGYDGALQATLLDAENWDTFSFDVVPVDLRA14 REISTDGVDLCIVTYELDIACSKLSSGLEKWTKVPGEWKTVAVSNNTIYGVDFKSSEIRY15 TYLK 16 >ACHHYP_08323 17 MVAWAWLPAAAAVVAATETHWSHLGNASSDRGLRIHTPITRADLHDEYNDAPVTQR18 RLSGSAASLFRAVAGYGFRGLSNAAIFSGVTLDMCASACVTDARCLSFDYEASTCYIA19 HTDRYAYPADFVPRATSTYYEWQGAAATPTIEPNGGRLTSYGAFQLFTTSRAAAMYY20 QFKSLENGTVTVYTLYSPGTTVTLPEYPCVVQAYTTKAGLSDSIVLVSNAFTVYAARYA21 YLVPFYNGLGFHGLVTRVQLDVQGVKRPRPSRVLEFTDINSTLGIGPFRGQLSTINLTA22 YDARLAGFFDAFTGITTTLCPQVESRVAVSTVTYVNVSLQVFQNASRWVLVPAPLYAS23 APGDLVFSSSVSLVEEYLYLCPHQNAKGHAGVIAKVNLRAFNATSHLPFQPAIEMLDLT24 VIDPSLTGFGSCFANRNYGYFVQRRNAAGLAGQIVRVNLDLFAQPALAVTVLNATTFD25 ARFVGFSGAVVYKNVAYLVPFERNKVGLELNPNYKYFPTPTSSIMGRLDLTTFSTVTPV26 DLSVLDVKYACGYFGGFTVSYYVYLVPNMWTTDTTSPGVNPYHGLVARLNTLTMNVE27 SLDLTLVDPSLKGFMRGFAFGRYAILVPHRNGLTTELPVRLNKSQKNNLGTIVAIDTDNF28 TPSGVRYLDLTLALRSQIPNMPDADLRGFIGGGVSGEYGFFVPYFNGVRFSGKVVRVN29 LRKFGEVQVLDMTQVHTSLRGFTNAVFPQLYEPTVTSLWNYVIPDGTQTPYTFITVDV 30 >ACHHYP_09221 31 MVSVTTPSMTLLGAIALVAGQATVAPTTATPSAPSASPTKGPWAFKSVRTVQARVQA32 DVPVWDAAHKEWVAVFPQNTVTFEQRYRAAMDTINTATVEGALFYVQTEGIDKAVQA33 ANGCMRKSNMSYIWYYDIEVVQPVYSVAEFGQNTGYAPEYGPFIAMDNGMCTPTSGT34 TVPQGCMQFTGLAGNIALGNYIGGEPRTKHQYANYANNYWFSYPNSCFTKSFTAKTD35 ACRNSPMQKGGLCPYGTKPDGINCTYSFSVLGYLSIDDLVGITSTVNPQTGKAFSNHM36 EFCKAGKYEWDFTTSTGLPFWADPLNVTANAARSAKMMDLYTAKVAAGVGEYANMK37 
PFPKVSELVAQNPSCSDNSPYCAKQPHGCQRSLLGQICVPCSSASPSCKPPTRAFPA38 LPVATTPPPVTDAAGNVVPMSTNLLGQAVPATSSASTVAFSATAAILVLALA 39 >ACHHYP_09519 40 MIVSAIVFAVLASAAGQSPLKIASSVPYALTIDGSAPVSTVISNTRATSLSVHIASMNLP41 PGATLTIGTVDGKDKVVYTGAHTNLVSDYFIQNKVVVSYAAASYSNNTTPLVAIDKYFA42 GTPDAGGLESICSTTGDLSRPAACYATSEPVKYAKARAIARLVIGGSSLCTGWLFGSE43 GHLLTNNHCINNDRLAASTQVEFGAECASCSDGSNNVQLACKGTIVASNVTLLATSSK44 LDFALVKINLNAGVDLSKYGYLQARDSAPVLNEPVWLAGHPQGDPLRMAVATSNNAE45 GAIVSTNVTDSCKDNQVGYLLDTQGGSSGSPVMSTVDNSVVAIHNCGGCDSETPSNG46


GIPLTKILAYLRANNIALPKNSVSAAPAPTTAKTTTASPATPAPSTSAPATAAPKPPTFTL
CSVSNKVISEYYTGLYVAPAGHTANEQFSYSPDTGAIQVQSNGQCLDAYWGGSSFLV
HTWPCDRGNNNQKWTVANNQVMHRVHGVCLTSVAGSKSLGVAPCNAADVRQWIYT
NCDTANVRNFVQLRTPRGALVSEWYSSVLAKQPQSSWTELWEINGQQMRSFSGSTC
LDAYWDNSRFQVHTWQCDPTNGNQQWRVGNSVVAHATHSNLCLDVDPTDPRQAAQ
VWGCHSATINSNQLFDVVAF
>ACHHYP_10824
MASISQWLCLSCWAPMSTPKTTMATDAWCGTFWKHMLMSVSVTPPPACDCSTDGA
TALFFAAQRGHSDIVYLLMSAGATAEESTLGISPKQIAQANGHTIVAAIFDTLPPPLPHRL
HWERSSVLFLSSFLVYRCNLLLLRH
>ACHHYP_11025
MHARFFAPVLGTLSLVAGSATTLAVNSSRTPQVNAQVRRLSKRALPRDMGKSSTSA
QAPEGSSKPDMMKDFPIFLFTIE
>ACHHYP_11286
MASESTPLLALLELPLLKPTSAETIQGHVTALRASFISGAMRPLAARKAQLRAIRALVE
DGCEILQAAMWKDLHKHAAETFVTETSSVLLEVQDHLDNLDDWAAPHKVGTNLLNLP
GSSYIRSDPLGVACIMDTWNYPIMLLLMPLIGAI
>ACHHYP_11397
MDRLLLLSALATAVAVDDAAPRPSRAPLPTTLVPWGSPLAAPTAPCTWGGRAHALD
WNLTTSVPGSRQCFPNLFAADQPLEFPYPRSSYNYDLDPPVVGPRVQVQWTNGVTN
VTAPVAAFDYRTFEMTGDELLFHALPDAPGVYRLAVQAFDWDRASSECRACLAVTDQ
VRPRATVARAGLCGASTTAPYSPEALAAADDRVRALVRYRATATNNDACSDRRCDAV
TVAQTGFLSAFPTAVVDGANAAVDAVPDGWLGCLAAPLSARERQRLTTPLALVDDAR
DYFVALQELYTPFRCGAPPGRPTCAGAASETCALMQAVVLPASHLVARVAVKLKATAG
HIADPAAAFPGAGYLPPSARHLHLAIPCYPTNASFSSFCADTVEWRVSDLFELSAELNA
SQPWGFDAAAPLVTWFVQQGPAWVAVADNKRLAFDKFQDTLVFRAMTPCGQVGEDI
AWTVFSHRAEALSVDAWWNSLWSCGGCNVPKADFSVCRFRFDPTSPLVSAMLHPPA
SCRDAAGRSCRNGCLARGQCNGRSTAASCGQQAGATWCDARGSALLAAAVPRYSL
RSLQCVWQYANTSSANWSVAVDVAVDTAFALKLRNADATELSVSCTLTFDPDTGEPA
VVKTRSLALSLRNCDGPRFEDHALAFVKDRCDASWRPGVGRQPAPRQACAGHLVFP
STTDAAATVLLTPADDLACCSGPVAAFSCQPLPGHPGLKQCQRADTATALLAAEPQA
WPPVALAASLALVFVLVRRRRQPSDTDLSRPLIDGDRC
>ACHHYP_12628
MIVQILALAATASAFTKCHIRHPNRTEVLSTPCPHEYVTELPASFDWRNVNGTNFVTV
SRNQHVPHYCGSCWAFAATSALSDRVRIARERNSEGKDRVLVTRQVNLSPQVLLNC
DKEDMGCHGGEGLSAYRYIHENGIPEEGCQRYLATGHDVGNTCTAIDVCRNCEPSKG
CFPQPSYDTYHVSEYGAVDGEAKMMAEIFARGPIVCGVAVTDEFLNYSGGVIDDKSGR
TDIDHDISIVGWGVDGSGTKYWVGRNSWGTYWGEEGWFRLRRGNNNLGVETDCAF
GVPADDGWPKRHTETTSPAKAAVWSGEIKSLLQPSRAQAKSRAPVHFVGGEKVLSPR
PHEEIDVLALPKQWDWRNIAGINYVTWDKNQHIPQYCGSCWAQATTSALSDRIAILRN
ASWPEIALSPQVVVNCHGGGSCEGGNPGAVYEYAHRHGIPDQTCQAYVAKDGQCNA
LGVCETCWPTNSSFTPGKCVAVPKFKSYYVAEYGHVRGADKMKAELYKRGPIGCGM
HVTDKFEAYTGGIYSEKTWFPIPNHEISIAGWGFDEATQTEYWIGRNSWGTYWGENG
WFRIKMHSDNLGIEGDCDWGVPIPDGSQPLL
>ACHHYP_13722


MKCFAVLAFAAFAAAATSEQAATTQPATTTAAVTTAAPNTTTVVPLVSTKAPNTTTVTP
APTTKAVTTVPVTTVKANTTAPVTTVPVVTQTNVTSPDETETPEPVIEQPTDAPLPVPT
KKKSNATTVPPSASASISMLSVASVAVAVAAYVM
>ACHHYP_14385
MATSVLALCFSSLTANSTNTPEPKYQTRTVDTVVYESSAKWPKYMGKGSAIQMYTTA
ALSAQILVSFPETTTVLEKVATVGPLVSLSAVIFFGAKYLGERVITNVTSCRTVGQRGIT
DAIYLYLDEFLKIQVAGGLRPKTFECYPKGVSALRLISYLKLVSKDENGMCNVKINRTTF
WLDLGKAQVHQEQSLKILLDGKPLLVRKGKIKKAARA
>ACHHYP_15409
MGLFAPVLAFATVAVAGSSSTTLPTAPASLSTTRSVPLTDRAALIQELAKWKDSKAGK
YAAANGFLKLSRLESAGDAEAELAAFAETKATVEALNQQYPLARFSTENPFALLTNDEF
ATWVSGGRDKVQRKVPEASTTQSTTASIAPGTVDWTMSGCVASVRSQGVCGSCFAF
AAVAAAESAYCLLHDRHLTPFSDQQVLSCGPGNGCMGGWSDQSLAWMASHGVCTG
ASYPHTNDWNTTAAACIPECKALSMPYSSVASVAGEHELEAAIALQPVAVDISATSPVF
KNYESGIITGGCNVDFNHVVLGVGYGVAEVPYFKMKNSWGDWWGEGGFVRLQRGV
GGVGTCGLARHAAYPVVFPMPFNLVTFRGVVISEYYSNLFASAKQGSVNELWTYDAIT
RHITVGSNHQCLDAYPTGSSYAVHTYSCDAKNDNQKWVIDSANHAIKHAVHPTLCLDV
DPNQNNKVQVWSCSPGNQNQWVAVSEERVKLWNVNGNFLASDGNLIQFYSPSSPSY
EWAVSNLDHTWRARSNVGAPDLCLDAYEPWNGGAVHLYTCDSTNGNQKWIYDAKTQ
QLRHLTHVGFCLDMRTALGDKAHLWTCNTPANSLQKFQYKSLTFPA


LITERATURE CITED

Archibald, J. M. 2008. The origin and spread of eukaryotic photosynthesis: evolving views in light of genomics. Bot. Mar., 52:95--103.

Archibald, J. M. 2009. The puzzle of plastid evolution. Curr. Biol., 19:R81--R88.

Armbrust, E. V. 2009. The life of diatoms in the world’s oceans. Nature, 459:185--192.

Armbrust, E. V., Berges, J. A., Bowler, C., Green, B. R., Martinez, D., Putnam, N. H., Zhou, S., Allen, A. E., Apt, K. E., Bechner, M., Brzezinski, M. A., Chaal, B. K., Chiovitti, A., Davis, A. K., Demarest, M. S., Detter, J. C., Glavina, T., Goodstein, D., Hadi, M. Z., Hellsten, U., Hildebrand, M., Jenkins, B. D., Jurka, J., Kapitonov, V. V., Kröger, N., Lau, W. W. Y., Lane, T. W., Larimer, F. W., Lippmeier, J. C., Lucas, S., Medina, M., Montsant, A., Obornik, M., Parker, M. S., Palenik, B., Pazour, G. J., Richardson, P. M., Rynearson, T. A., Saito, M. A., Schwartz, D. C., Thamatrakoln, K., Valentin, K., Vardi, A., Wilkerson, F. P. & Rokhsar, D. S. 2004. The genome of the diatom Thalassiosira pseudonana: Ecology, evolution and metabolism. Science, 306:79--86.

Baginsky, S., Kleffmann, T., von Zychlinski, A. & Gruissem, W. 2005. Analysis of shotgun proteomics and RNA profiling data from Arabidopsis thaliana chloroplasts. J. Prot. Res., 4:637--640.

Barbrook, A. C., Howe, C. J. & Purton, S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci., 11:101--108.

Baurain, D., Brinkmann, H., Petersen, J., Rodríguez-Ezpeleta, N., Stechmann, A., Demoulin, V., Roger, A. J., Burger, G., Lang, B. F. & Philippe, H. 2010. Phylogenomic evidence for separate acquisition of plastids in cryptophytes, haptophytes and stramenopiles. Mol. Biol. Evol., 27:1698--1709.

Beakes, G. W. & Sekimoto, S. 2009. The evolutionary phylogeny of oomycetes - insights gained from studies of holocarpic parasites of algae and invertebrates. In: K. Lamour and S. Kamoun (eds), Oomycete Genetics and Genomics: Diversity, Interactions, and Research Tools. John Wiley & Sons, Inc., Hoboken, NJ, USA. doi: 10.1002/9780470475898.ch1.

Birch, P. R. J., Rehmany, A. P., Pritchard, L., Kamoun, S. & Beynon, J. L. 2006. Trafficking arms: oomycete effectors enter host plant cells. Trends Microbiol., 14:8--11.

Bittner, L., Halary, S., Payri, C., Cruaud, C., de Reviers, B., Lopez, P. & Bapteste, E. 2010. Some considerations for analyzing biodiversity using integrative metagenomics and gene networks. Biol. Direct, 5:doi:10.1186/1745-6150-5-47.


Bodyl, A. & Moszczynski, K. 2006. Did the peridinin plastid evolve through tertiary endosymbiosis? A hypothesis. Eur. J. Phycol., 41:435--448.

Bodyl, A. 2005. Do plastid-related characters support the chromalveolate hypothesis? J. Phycol., 41:712--719.

Bodyl, A., Stiller, J. W. & Mackiewicz, P. 2009. Chromalveolate plastids: direct descent or multiple endosymbiosis. Trends Ecol. Evol., 3:119--121.

Bowler, C., Allen, A. E., Badger, J. H., Grimwood, J., Jabbari, K., Kuo, A., Maheswari, U., Martens, C., Maumus, F., Otillar, R. P., Rayko, E., Salamov, A., Vandepoele, K., Beszteri, B., Gruber, A., Heijde, M., Katinka, M., Mock, T., Valentin, K., Verret, F., Berges, J. A., Brownlee, C., Cadoret, J. P., Chiovitti, A., Choi, C. J., Coesel, S., De Martino, A., Detter, J. C., Durkin, C., Falciatore, A., Fournet, J., Haruta, M., Huysman, M. J., Jenkins, B. D., Jiroutova, K., Jorgensen, R. E., Joubert, Y., Kaplan, A., Kroger, N., Kroth, P. G., La Roche, J., Lindquist, E., Lommer, M., Martin-Jezequel, V., Lopez, P. J., Lucas, S., Mangogna, M., McGinnis, K., Medlin, L. K., Montsant, A., Oudot-Le Secq, M. P., Napoli, C., Obornik, M., Parker, M. S., Petit, J. L., Porcel, B. M., Poulsen, N., Robison, M., Rychlewski, L., Rynearson, T. A., Schmutz, J., Shapiro, H., Siaut, M., Stanley, M., Sussman, M. R., Taylor, A. R., Vardi, A., von Dassow, P., Vyverman, W., Willis, A., Wyrwicz, L. S., Rokhsar, D. S., Weissenbach, J., Armbrust, E. V., Green, B. R., Van de Peer, Y. & Grigoriev, I. V. 2008. The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature, 456:239--244.

Burki, F., Shalchian-Tabrizi, K., Minge, M., Skjaeveland, A., Nikolaev, S. I., Jakobsen, K. S. & Pawlowski, J. 2007. Phylogenomics reshuffles the eukaryotic supergroups. PLoS ONE, 2:e790.

Cavalier-Smith, T. 1999. Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J. Eukaryot. Microbiol., 46:347--366.

Cavalier-Smith, T. 2003. Genomic reduction and evolution of novel genetic membranes and protein-targeting machinery in eukaryote-eukaryote chimaeras (meta-algae). Philos. Trans. R. Soc. Lond. B. Biol., 359:109--134.

Chan, C. X., Reyes-Prieto, A. & Bhattacharya, D. 2011. Red and green algal origin of diatom membrane transporters: Insights into environmental adaptation and cell evolution. PLoS ONE, 6(12):e29138. doi:10.1371/journal.pone.0029138.

Cock, J. M., Sterck, L., Rouze, P., Scornet, D., Allen, A. E., Amoutzias, G., Anthouard, V., Artiguenave, F., Aury, J. M., Badger, J. H., Beszteri, B., Billiau, K., Bonnet, E., Bothwell, J. H., Bowler, C., Boyen, C., Brownlee, C., Carrano, C. J., Charrier, B., Cho, G. Y., Coelho, S. M., Collen, J., Corre, E., Da Silva, C., Delage, L., Delaroque, N., Dittami, S. M., Doulbeau, S., Elias, M., Farnham, G., Gachon, C. M. M., Gschloessl, B., Heesch, S., Jabbari, K., Jubin, C., Kawai, H., Kimura, K., Kloareg, B., Küpper, F. C., Lang, D., Le Bail, A., Leblanc, C., Lerouge, P., Lohr, M., Lopez, P. J., Martens, C., Maumus, F., Michel, G., Miranda-Saavedra, D., Morales, J., Moreau, H., Motomura, T., Nagasato, Ch., Napoli, C. A., Nelson, D. R., Nyvall-Collén, P., Peters, A. F., Pommier, C., Potin, P., Poulain, J., Quesneville, H., Read, B., Rensing, S. A., Ritter, A., Rousvoal, S., Samanta, M., Samson, G., Schroeder, D. C., Ségurens, B., Strittmatter, M., Tonon, T., Tregear, J. W., Valentin, K., von Dassow, P., Yamagishi, T., Van de Peer, Y. & Wincker, P. 2010. The Ectocarpus genome and the independent evolution of multicellularity in brown algae. Nature, 465:617--621.

De Koning, A. P. & Keeling, P. J. 2004. Nucleus-encoded genes for plastid-targeted proteins in Helicosporidium: functional diversity of a cryptic plastid in a parasitic alga. Eukaryot. Cell, 3:1198--1205.

Delwiche, C. F. 1999. Tracing the thread of plastid diversity through the tapestry of life. Am. Nat., 154:S164--S177.

Dodge, J. D. 1975. A survey of chloroplast ultrastructure in the Dinophyceae. Phycologia, 14:253--263.

Dong, J., Chen, C. & Chen, Z. 2003. Expression profiles of the Arabidopsis WRKY gene superfamily during plant defense response. Plant Mol. Biol., 51:21--37.

Dorrell, R. G. & Smith, A. G. 2011. Do red and green make brown?: perspectives on plastid acquisitions within chromalveolates. Eukaryot. Cell, 10:856--868.

Drummond, A. J., Ashton, B., Buxton, S., Cheung, M., Cooper, A., Duran, C., Field, M., Heled, J., Kearse, M., Markowitz, S., Moir, R., Stones-Havas, S., Sturrock, S., Thierer, T. & Wilson, A. 2011. Geneious v5.5. www.geneious.com.

Elias, M. & Archibald, J. M. 2009. Sizing up the genomic footprint of endosymbiosis. BioEssays, 31:1273--1279.

Emanuelsson, O., Nielsen, H. & von Heijne, G. 1999. ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Prot. Sci., 8:978--984.

Foth, B. J. & McFadden, G. I. 2003. The apicoplast: a plastid in Plasmodium falciparum and other Apicomplexan parasites. Int. Rev. Cytol., 224:57--110.

Gaulin, E., Madoui, M. A., Bottin, A., Jacquet, C., Mathé, C., Couloux, A., Wincker, P. & Dumas, B. 2008. Transcriptome of Aphanomyces euteiches: new oomycete putative pathogenicity factors and metabolic pathways. PLoS ONE, doi:10.1371/journal.pone.0001723.

Gibbs, S. 1981a. The chloroplast endoplasmic reticulum: structure, function, and evolutionary significance. Int. Rev. Cytol., 72:49--99.


Gibbs, S. 1981b. The chloroplast of some algal groups may have evolved from endosymbiotic eukaryotic algae. Ann. N.Y. Acad. Sci., 361:193--208.

Green, B. R. 2011. After the primary endosymbiosis: an update on the chromalveolate hypothesis and the origins of algae with Chl c. Photosynth. Res., 107:103--115.

Gruber, A., Vugrinec, S., Hempel, F., Gould, S. B., Maier, U. G. & Kroth, P. G. 2007. Protein targeting into complex diatom plastids: functional characterisation of a specific targeting motif. Plant Mol. Biol., 64:519--530.

Gschloessl, B., Guermeur, Y. & Cock, J. M. 2008. HECTAR: A method to predict subcellular targeting in heterokonts. BMC Bioinformatics, doi: 10.1186/1471-2105-9-393.

Guillot, M. & Gibbs, S. 1980a. Evidence that the chloroplast and nucleomorph of cryptomonads are remnants of a eukaryotic symbiont. J. Cell Biol., 87:186.

Guillot, M. & Gibbs, S. 1980b. The cryptomonad nucleomorph: its ultrastructure and evolutionary significance. J. Phycol., 16:558--568.

Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W. & Gascuel, O. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol., 59:307--321.

Hackett, J. D., Yoon, H. S., Li, S., Reyes-Prieto, A., Rümmele, S. E. & Bhattacharya, D. 2007. Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of Rhizaria with chromalveolates. Mol. Biol. Evol., 24:1702--1713.

Hackett, J. D., Yoon, H. S., Soares, M. B., Bonaldo, M. F., Casavant, T. L., Sheetz, T. E., Nosenko, T. & Bhattacharya, D. 2004. Migration of the plastid genome to the nucleus in a peridinin dinoflagellate. Curr. Biol., 14:213--218.

Harper, J. T., Waanders, E. & Keeling, P. J. 2005. On the monophyly of chromalveolates using a six-protein phylogeny of eukaryotes. Int. J. Syst. Evol. Micr., 55:487--496.

Huang, J., Mullapudi, N., Lancto, C. A., Scott, M., Abrahamsen, M. S. & Kissinger, J. C. 2004. Genomic evidence supports past endosymbiosis, intracellular and horizontal gene transfer in Cryptosporidium parvum. Genome Biol., 11:R88.

Iida, K., Takishita, K., Ohshima, K. & Inagaki, Y. 2007. Assessing the monophyly of chlorophyll-c containing plastids by multi-gene phylogenies under the unlinked model conditions. Mol. Phylogenet. Evol., 45:227--238.


Janouskovec, J., Horak, A., Obornik, M., Lukes, J. & Keeling, P. J. 2010. A common red algal origin of the apicomplexan, dinoflagellate and heterokont plastids. Proc. Natl. Acad. Sci. USA, 107:10949--10954.

Jiang, R. H., Tyler, B. M., Whisson, S. C., Hardham, A. R. & Govers, F. 2006. Ancient origin of elicitin gene clusters in Phytophthora genomes. Mol. Biol. Evol., 2:338--351.

Kamoun, S. 2006. A catalogue of the effector secretome of plant pathogenic oomycetes. Annu. Rev. Phytopathol., 44:41--60.

Keeling, P. J. 2004. Diversity and evolutionary history of plastids and their hosts. Am. J. Bot., 91:1481--1493.

Keeling, P. J. 2009. Role of horizontal gene transfer in the evolution of photosynthetic eukaryotes and their plastids. Methods Mol. Biol., 532:501--515.

Khan, H., Parks, N., Kozera, C., Curtis, B. A., Parsons, B. J., Bowman, S. & Archibald, J. M. 2007. Plastid genome sequence of the cryptophyte alga, Rhodomonas salina CCMP1319: lateral transfer of putative DNA replication machinery and a test of chromist plastid phylogeny. Mol. Biol. Evol., 24:1832--1842.

Kleffmann, T., Russenberger, D., von Zychlinski, A., Christopher, W., Sjolander, K., Gruissem, W. & Baginsky, S. 2004. The Arabidopsis thaliana chloroplast proteome reveals pathway abundance and novel protein functions. Curr. Biol., 14:354--362.

Kleffmann, T., Hirsch-Hoffmann, M., Gruissem, W. & Baginsky, S. 2006. plprot: a comprehensive proteome database for different plastid types. Plant Cell Physiol., 47:432--436.

Köhler, S., Delwiche, C. F., Denny, P. W., Tilney, L. G., Webster, P., Wilson, R. J., Palmer, J. D. & Roos, D. S. 1997. A plastid of probable green algal origin in apicomplexan parasites. Science, 275:1485--1489.

Kolaczkowski, B. & Thornton, J. W. 2008. A mixed branch length model of heterotachy improves phylogenetic accuracy. Mol. Biol. Evol., 25:1054--1066.

Kroth, P. G. 2002. Protein transport into secondary plastids and the evolution of primary and secondary plastids. Int. Rev. Cytol., 221:191--255.

Lane, C. E. & Archibald, J. M. 2008. The eukaryotic tree of life: endosymbiosis takes its TOL. Trends Ecol. Evol., 5:268--275.

Larkum, A. W. D., Lockhart, P. J. & Howe, C. J. 2007. Shopping for plastids. Trends Plant Sci., 12:189--195.

Lee, J. J., Leedale, G. F. & Bradbury, P. (eds) 2000. Illustrated Guide to the Protozoa. 2nd ed., Society of Protozoologists, Allen Press, Lawrence, Kansas.

Marchler-Bauer, A., Anderson, J. B., Derbyshire, M. K., DeWeese-Scott, C., Gonzales, N. R., Gwadz, M., Hao, L., He, S., Hurwitz, D. I., Jackson, J. D., Ke, Z., Krylov, D., Lanczycki, C. J., Liebert, C. A., Liu, C., Lu, F., Lu, S., Marchler, G. H., Mullokandov, M., Song, J. S., Thanki, N., Yamashita, R. A., Yin, J. J., Zhang, D. & Bryant, S. H. 2007. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res., 35:D237--D240.

Moustafa, A., Beszteri, B., Maier, U. G., Bowler, C., Valentin, K. & Bhattacharya, D. 2009. Genomic footprints of a cryptic plastid endosymbiosis in diatoms. Science, 324:1724--1726.

Okamoto, N., Chantangsi, C., Horák, A., Leander, B. S. & Keeling, P. J. 2009. Molecular phylogeny and description of the novel katablepharid Roombia truncata gen. et sp. nov., and establishment of the Hacrobia taxon nov. PLoS ONE, 4:e7080. doi:10.1371/journal.pone.0007080.

Pagel, M. & Meade, A. 2008. Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo. Phil. Trans. R. Soc. B., 363:3955--3964.

Parfrey, L. W., Grant, J., Tekle, I. Y., Lasek-Nesselquist, E., Morrison, H. G., Sogin, M. L., Patterson, D. J. & Katz, L. A. 2010. Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst. Biol., 59:518--533.

Patron, N. J., Inagaki, Y. & Keeling, P. J. 2007. Multiple gene phylogenies support the monophyly of cryptomonad and haptophyte host lineages. Curr. Biol., 17:887--891.

Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. 2011. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods, 8:785--786.

Philippe, H., Zhou, Y., Brinkmann, H., Rodrigue, N. & Delsuc, F. 2005. Heterotachy and long-branch attraction in phylogenetics. BMC Evol. Biol., 5:50. doi:10.1186/1471-2148-5-50.

Ralph, S. A., van Dooren, G. G., Waller, R. F., Crawford, M. J., Fraunholz, J. J., Foth, B. J., Tonkin, C. J., Roos, D. S. & McFadden, G. I. 2004. Metabolic maps and functions of the Plasmodium falciparum apicoplast. Nature Rev. Microbiol., 2:203--216.

Reyes-Prieto, A., Moustafa, A. & Bhattacharya, D. 2008. Multiple genes of apparent algal origin suggest ciliates may once have been photosynthetic. Curr. Biol., 13:956--962.

Rice, D. W. & Palmer, J. D. 2006. An exceptional gene transfer in plastids: gene replacement by a distant bacterial paralog and evidence that haptophyte and cryptophyte plastids are sisters. BMC Biol., 4:31.


Rogers, M. B., Patron, N. J. & Keeling, P. J. 2007. Horizontal transfer of a eukaryotic plastid-targeted protein gene to cyanobacteria. BMC Biol., 5:26.

Sanchez-Puerta, M. V. & Delwiche, C. F. 2008. A hypothesis for plastid evolution in chromalveolates. J. Phycol., 44:1097--1107.

Sanchez-Puerta, M. V., Lippmeier, J. C., Apt, K. E. & Delwiche, C. F. 2007. Plastid genes in a non-photosynthetic dinoflagellate. Protist, 158:105--117.

Sekimoto, S., Klochkova, T. A., West, J. A., Beakes, G. W. & Honda, D. 2009. Olpidiopsis bostrychiae sp. nov.: an endoparasitic oomycete that infects Bostrychia and other red algae (Rhodophyta). Phycologia, 48:460--472.

Shindo, T., Misas-Villamil, J. C., Hörger, A. C., Song, J. & van der Hoorn, R. A. L. 2012. A role in immunity for Arabidopsis cysteine protease RD21, the ortholog of the tomato immune protease C14. PLoS ONE, 7:e29317. doi:10.1371/journal.pone.0029317.

Slamovits, C. H. & Keeling, P. J. 2008. Plastid-derived genes in the nonphotosynthetic alveolate Oxyrrhis marina. Mol. Biol. Evol., 25:1297--1306.

Soll, J. & Schleiff, E. 2004. Protein import into chloroplasts. Nature Rev. Mol. Cell Biol., 5:198--208.

Stiller, J. W., Huang, J., Ding, Q., Tian, J. & Goodwillie, C. 2009. Are algal genes in nonphotosynthetic protists evidence of historical plastid endosymbiosis? BMC Genomics, doi:10.1186/1471-2164-10-484.

Tatusov, R. L., Natale, D. A., Fedorova, N. D., Jackson, J., Jacobs, A., Krylov, D. M., Mekhedov, S. L., Nikolskaya, A. N., Rao, B. S., Wolf, Y. I., Aravind, L., Lanczycki, C., Mazumder, R., Sreekumar, K., Vasudevan, S., Walker, D. R., Tatusova, T. A., Yao, K., Yin, J. & Koonin, E. V. 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics, 4:41.

Tyler, B. M., Tripathy, S., Zhang, X., Dehal, P., Jiang, R. H. Y., Aerts, A., Arredondo, F. P., Baxter, L., Bensasson, D., Beynon, J. L., Chapman, J., Damasceno, C. M. B., Dorrance, A. E., Dou, D., Dickerman, A. W., Dubchak, I. L., Garbelotto, M., Gijzen, M., Gordon, S. G., Govers, F., Grunwald, N. J., Huang, W., Ivors, K. L., Jones, R. W., Kamoun, S., Krampis, K., Lamour, K. H., Lee, M. K., McDonald, W. H., Medina, M., Meijer, H. J. G., Nordberg, E. K., Maclean, D. J., Ospina-Giraldo, M. D., Morris, P. F., Phuntumart, V., Putnam, N. H., Rash, S., Rose, J. K. C., Sakihama, Y., Salamov, A. A., Savidor, A., Scheuring, C. F., Smith, B. M., Sobral, B. W. S., Terry, A., Torto-Alalibo, T. A., Win, J., Xu, Z., Zhang, H., Grigoriev, I. V., Rokhsar, D. S. & Boore, J. L. 2006. Phytophthora genome sequences uncover evolutionary origins and mechanisms of pathogenesis. Science, 313:1261--1266.


Whelan, S. & Goldman, N. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol., 18:691--699.

Wilson, R. J. M. 2004. Plastid functions in the Apicomplexa. Protist, 155:11--12.

Woehle, C., Dagan, T., Martin, W. F. & Gould, S. B. 2011. Red and problematic green phylogenetic signals among thousands of nuclear genes from the photosynthetic and apicomplexa-related Chromera velia. Genome Biol. Evol., 3:1220--1230.

Yoon, H. S., Hackett, J. D., Ciniglia, C., Pinto, G. & Bhattacharya, D. 2004. A molecular timeline for the origin of photosynthetic eukaryotes. Mol. Biol. Evol., 21:809--818.

Yoon, H. S., Hackett, J. D., Pinto, G. & Bhattacharya, D. 2002. The single, ancient origin of chromist plastids. Proc. Natl. Acad. Sci. USA, 99:15507--15512.