PLASTID-TARGETED PROTEINS ARE ABSENT FROM THE PROTEOMES OF ACHLYA HYPOGYNA AND THRAUSTOTHECA CLAVATA (OOMYCOTA,
STRAMENOPILA): IMPLICATIONS FOR THE ORIGIN OF CHROMALVEOLATE PLASTIDS AND THE ‘GREEN GENE’ HYPOTHESIS
Lindsay Rukenbrod
A Thesis Submitted to the University of North Carolina Wilmington in Partial Fulfillment
of the Requirements for the Degree of Master of Science
Center for Marine Science
University of North Carolina Wilmington
2012
Approved by
Advisory Committee
D. Wilson Freshwater
Jeremy Morgan
Allison Taylor
J. Craig Bailey, Chair
Accepted by
Dean, Graduate School
This thesis has been prepared in the style and format consistent with the Journal of Eukaryotic Microbiology.
TABLE OF CONTENTS
ABSTRACT .....................................................................................................................iv
ACKNOWLEDGMENTS ..................................................................................................vi
DEDICATION ................................................................................................................. vii
LIST OF TABLES .......................................................................................................... viii
LIST OF FIGURES ..........................................................................................................ix
CHAPTER 1: Implications for the origin of chromalveolate plastids ............................... x
INTRODUCTION .................................................................................................. 1
METHODS............................................................................................................ 3
RESULTS AND DISCUSSION ............................................................................. 4
Revised Hypotheses for the Evolution of Chromalveolate Plastids ............ 6
CHAPTER 2: Do chromalveolate genomes encode ‘green genes’? ............................ 15
INTRODUCTION ................................................................................................ 16
METHODS.......................................................................................................... 18
RESULTS AND DISCUSSION ........................................................................... 19
Green Genes in Oomycetes and Other Chromalveolates? ...................... 22
SUPPLEMENTAL INFORMATION................................................................................ 32
LITERATURE CITED .................................................................................................... 41
ABSTRACT
Chapter 1
The chromalveolate hypothesis predicts that extant nonphotosynthetic stramenopiles
are secondarily nonphotosynthetic and derived from ancestors bearing a secondary red-
type plastid. To test this hypothesis, proteomes of the oomycetes Achlya hypogyna and
Thraustotheca clavata were canvassed for plastid-targeted genes. Proteins for each
species encoding putative plastid-targeting signal peptides were identified, annotated,
and assigned to protein families if possible. Forty-six candidate proteins were culled
from the two genomes. Bioinformatic analyses revealed that the proteomes of Achlya
and Thraustotheca do not encode plastid-targeted genes acquired by endosymbiotic
gene transfer. All proteins possessing non-mitochondrial-targeting signal peptides
identified were judged to belong to the secretome (i.e., extracellularly secreted proteins).
These results indicate that oomycetes are ancestrally aplastidic stramenopiles and do
not support the chromalveolate theory of plastid evolution. Revised hypotheses for the
origin of plastids characterized by chlorophylls a and c and fucoxanthin are presented. It
is concluded that alveolate and stramenopile plastids are likely tertiary or higher order
plastids, not secondary plastids.
Chapter 2
The hypothesis that a green algal symbiosis preceded the red algal symbiont that gave
rise to red-type plastids in the ancestors of the chromalveolates is reexamined. A
network approach was used to detect nuclear-encoded proteins from the genomes of
Achlya hypogyna, Thraustotheca clavata, other oomycetes, and other chromalveolates
that cluster with green algal genes. Twelve oomycete proteins clustering with green
algal genes at high stringency were annotated and selected for further analyses.
Representative homologs from all other eukaryotic taxa available were aligned to
sequences comprising each network and maximum likelihood trees were constructed
from these alignments. Protein trees derived from these data exhibited obvious errors
resulting from taxon biases and heterotachy. These results argue that ‘green genes’
detected in phylogenomics studies are artifactual and not indicative of endosymbiotic
gene transfer.
ACKNOWLEDGMENTS
My thanks go to my advisor, Dr. J. Craig Bailey, whose enthusiasm about
molecular protistology caught my interest in the very beginning of my scientific
education. His continuous encouragement, wit, and sense of humor made this journey
an enjoyable one. Ian Misner and Dr. Chris Lane of the University of Rhode Island have
also been instrumental in my education, providing feedback and technical support in my
research. I’d also like to thank my committee members, Dr. D. Wilson Freshwater, Dr.
Jeremy Morgan, and Dr. Allison Taylor, for their encouragement and flexibility
throughout this process.
My lab mates past and present, particularly Cory Dashiell, Erika Shwarz, Ashley
Hayes, and Allison Martin, helped me maintain my focus over the years throughout
failed DNA extractions, computer malfunctions, approaching deadlines, and many other
graduate school related challenges.
The Department of Biology and Marine Biology, the Center for Marine Science
and the National Science Foundation provided financial support for my education and
research.
Finally, I’d like to thank my parents and my husband for supporting me every step
of the way.
DEDICATION
I’d like to dedicate this to my mother, whose endless patience has allowed me to
explore life with few restrictions and overwhelming love and support.
LIST OF TABLES
Table Page
Chapter 1
1. Protein IDs for 46 hypothetical proteins detected in the genomes of
Thraustotheca and/or Achlya.. ............................................................................ 9
2. Protein ID numbers, annotations and protein family designations.. ................... 11
3. Proteins sorted into one of 14 unique protein families.. ..................................... 13
4. List of seven proteins from the Achlya and Thraustotheca and putative
homologs found in the Arabidopsis thaliana plastid proteome.. ........................ 14
Chapter 2
1. List of 12 annotated proteins from the Achlya and/or Thraustotheca proteomes
or other oomycetes found in EGNs . ........................................................................ 24
LIST OF FIGURES
Figure Page
Chapter 1
1. Hypotheses for the origin of complex, higher order
chlorophyll a+c-containing plastids in chromalveolates. ....................................... 8
Chapter 2
1. Three examples of putative green genes in oomycete
genomes based on EGN analysis. .................................................................... 25
2. DEXDc ML tree ................................................................................................... 26
3. RPB ML tree ....................................................................................................... 27
4. ALDH ML tree ..................................................................................................... 28
5. TOR-containing kinase ML tree .......................................................................... 29
6. YAK1 ML tree: .................................................................................................... 30
7. ALS ML tree....................................................................................................... 31
CHAPTER 1: Implications for the origin of chromalveolate plastids.
INTRODUCTION
The evolutionary origin and subsequent movement of secondary and higher order
plastids among photosynthetic eukaryotes are the subject of intense debate. The principal
key to unraveling the evolutionary history of plastids is an accurate understanding of the
relationships among both host and plastid lineages (Archibald 2009; Green 2011). This
goal is hampered by the mosaic nature of eukaryotic genomes comprised of lineage-
specific genes inherited vertically, thousands of genes acquired by endosymbiotic gene
transfer (EGT), and genes obtained via lateral gene transfer (LGT) (Archibald 2008;
Green 2011; Keeling 2009; Larkum 2007).
The chromalveolate hypothesis posits that the alveolates, cryptomonads,
haptophytes and stramenopiles are monophyletic and that the last common ancestor of
these lineages was a photosynthetic alga bearing a red-type plastid (Cavalier-Smith
1999; 2003). This notion is supported, in the first instance, by the fact that
photosynthetic members of these chlorophyll a+c-containing groups all possess red-
type plastids surrounded by three or four unit membranes [the so-called chloroplast-
endoplasmic reticulum, or CER], a feature indicative of secondary endosymbiosis
(Dodge 1975; Foth and McFadden 2003; Guillot and Gibbs 1980a, b; Gibbs 1981a, b;
Köhler et al. 1997). Second, nuclear-encoded plastid-targeted proteins in these algae
are characterized by the presence of a 5’ bipartite signal sequence that directs gene
products to the plastid and across the outer- and inner-pair of plastid membranes (Kroth
2002; Soll and Schleiff 2004). Third, in terms of coding capacity, gene content, and
organization, the plastid genomes of chromalveolates resemble those of red algae far
more closely than they resemble the plastid genomes of green algae (Delwiche 1999;
Keeling 2004; Yoon et al. 2002). Cavalier-Smith (1999) originally emphasized that the
chromalveolate hypothesis is consistent with the idea that the chloroplast-endoplasmic
reticulum (CER) and complex protein-trafficking systems that characterize
chromalveolates are unlikely to have evolved independently on different occasions (see
Kroth 2002; Ralph et al. 2004).
Over the last decade, tests of the ‘chromalveolate’ concept have been the subject,
implicitly or explicitly, of numerous broad-scale phylogenetic studies. The
chromalveolates have not been recovered as a monophyletic group in any study
(Archibald 2009; Baurain et al. 2010). More recent studies imply that the relationships
among chromalveolate host cells and their plastids are more complex than originally
supposed, perhaps involving tertiary and higher-order transfers among hosts (Archibald
2009; Bodyl 2005; Keeling 2004; Sanchez-Puerta and Delwiche 2008). In this paper
the chromalveolate hypothesis is re-examined in light of new genomic data available for
nonphotosynthetic members of the Stramenopila.
The stramenopiles, one of the four principal taxa included in the Chromalveolata,
are divided into two groups. (i) The ‘photosynthetic stramenopiles’, ‘heterokont algae’ or
‘ochrophytes’ comprise chlorophyll a+c-containing photosynthetic algae, including
phaeophytes, chrysophytes, diatoms, eustigmatophytes, pelagophytes,
and xanthophytes (Lee et al. 2000). (ii) The nonphotosynthetic stramenopiles are
bacterivorous, parasitic or saprobic heterotrophs and include bicosoecids,
hyphochytrids, labyrinthulids, oomycetes, and thraustochytrids, among others (Lee et al.
2000). The oomycetes are the most diverse, well studied, and economically important
of all nonphotosynthetic stramenopiles.
The chromalveolate hypothesis implies that extant aplastidic stramenopiles are
derived from ancestors that once possessed a secondary red-type plastid. However,
there is no ultrastructural or DNA evidence suggesting that bicosoecids, hyphochytrids,
labyrinthulids, oomycetes, or thraustochytrids possess, or possessed in the past, a
plastid. Furthermore, ultrastructural or DNA sequence evidence for cryptic plastids in
these organisms is absent or controversial (Lee et al. 2000; Reyes-Prieto et al. 2008;
Slamovits and Keeling 2008; Stiller et al. 2009).
In this study the proteomes of the oomycetes Achlya hypogyna and
Thraustotheca clavata were canvassed in search of photosynthesis-related genes.
METHODS
Full-length predicted proteins were obtained from ongoing genome sequencing projects
for Achlya hypogyna (ATCC 48635) and Thraustotheca clavata (ATCC 34112), estimated
to encode 17,430 and 12,154 predicted proteins, respectively; additional details will be
published separately. The Achlya and Thraustotheca proteomes were searched for
possible plastid-targeted genes using the chloroplast transit peptide prediction program ChloroP
(v.1.1) (Emanuelsson et al. 1999). Hypothetical proteins returned from these searches
were subsequently analyzed using SignalP (v.4.0) (Petersen et al. 2011), annotated and
assigned to protein families if possible using the Conserved Domain Database (CDD)
(Marchler-Bauer et al. 2007). Mitochondria-targeted proteins and proteins possessing
transmembrane regions identified using TMHMM (v.2.0) were removed from the data set
(Krogh et al. 2001). Searches for heterokont-like bipartite plastid-targeting peptides,
consisting of both signal and transit peptide motifs, were conducted using HECTAR
(Gruber et al. 2007; Gschloessl et al. 2008; Waller et al. 2000). Finally, the oomycete
proteins were BLASTed against the Arabidopsis thaliana plastid proteome database
(which includes plastid- and nuclear-encoded plastid targeted proteins) using plprot
v.2.3 (Baginsky et al. 2005; Kleffmann et al. 2004; 2006).
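
The screening steps described above amount to a series of filters applied to per-protein predictions. The following Python sketch illustrates that logic only; it does not run ChloroP, SignalP, or TMHMM. The record fields (`chlorop`, `signalp`, `mito`, `tm_helices`), the function name `screen`, and the example records are hypothetical, and the rule that either a ChloroP or a SignalP call suffices to retain a protein is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Per-protein predictor calls (hypothetical field names)."""
    protein_id: str
    chlorop: bool     # a transit peptide was predicted
    signalp: bool     # a signal peptide was predicted
    mito: bool        # predicted to be mitochondria-targeted
    tm_helices: int   # number of predicted transmembrane helices


def screen(predictions):
    """Return the IDs of proteins that pass the screen, in input order."""
    kept = []
    for p in predictions:
        # Assumption: either predictor call marks a candidate leader peptide.
        has_leader = p.chlorop or p.signalp
        # Mitochondria-targeted and transmembrane proteins are removed.
        if has_leader and not p.mito and p.tm_helices == 0:
            kept.append(p.protein_id)
    return kept


# Made-up records for illustration only; these are not thesis data.
example = [
    Prediction("PROT_A", chlorop=True, signalp=True, mito=False, tm_helices=0),
    Prediction("PROT_B", chlorop=True, signalp=False, mito=True, tm_helices=0),
    Prediction("PROT_C", chlorop=False, signalp=True, mito=False, tm_helices=3),
    Prediction("PROT_D", chlorop=False, signalp=False, mito=False, tm_helices=0),
]
print(screen(example))  # ['PROT_A']
```

In this toy input only PROT_A survives: PROT_B is predicted to be mitochondrial, PROT_C contains transmembrane helices, and PROT_D has no predicted leader peptide.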
RESULTS AND DISCUSSION
The chromalveolate hypothesis implies that the ancestors of oomycetes were
photosynthetic organisms bearing red-type plastids, and putative plastid-related genes
have been reported from the genomes of the plant pathogens Phytophthora ramorum
and P. sojae (Tyler et al. 2006). The competing hypothesis is the long-held view that
oomycetes are ancestrally aplastidic. It is possible that the ancestors of oomycetes
were photosynthetic but that extant members of the group have not retained any plastid-
associated genes. On the other hand, empirical data including studies of
apicomplexans, dinoflagellates and other taxa imply that plastid-associated genes are
unlikely to be completely purged from the genome even in organisms where a vestigial,
nonphotosynthetic plastid is absent (Barbrook et al. 2006; de Koning and Keeling 2004;
Matsuzuki et al. 2008; Wilson 2004; Sanchez-Puerta et al. 2007).
Thirty hypothetical proteins from the Achlya genome and 16 from the
Thraustotheca genome putatively possessing a 5’ plastid-targeting signal peptide were
identified (Table 1). Of these 46 proteins, 22 are presently characterized as hypothetical
proteins of unknown function; the remaining 24 were annotated (E-value < 1.00E-25) and found
to represent 14 unique protein families (Tables 2, 3).
BLASTp queries revealed that none of the oomycete proteins (Table 2) are
encoded by the 271 eukaryotic plastid genomes sequenced to date. None of the 46
presequences examined here possess the ASAFP (Y/W/L) motif necessary for plastid
import in diatoms, although the significance of this observation is unclear (Gruber et al.
2007) (see supplementary Tables S1 and S2).
Putative homologs to seven of the oomycete proteins were detected in the A.
thaliana plastid proteome (Table 4). These seven oomycete proteins are more-or-less
distant relatives of three A. thaliana genes. Both Achlya and Thraustotheca encode
proteins similar to the zinc-finger type WRKY1 DNA-binding transcription factor that
plays a role in disease resistance in A. thaliana (Dong et al. 2003; Shindo et al. 2012).
Three Achlya proteins and one Thraustotheca protein are similar to the cysteine proteinase
RD21A found in the A. thaliana plastid proteome. Finally, a single
Achlya protein distantly related (E-value 6E-17) to A. thaliana aldehyde dehydrogenase (ALDH)
was also detected.
These three genes are not indicators of photosynthesis per se because
homologs have been detected from across the tree of life in photosynthetic (e.g., plants
and green algae) and nonphotosynthetic organisms (e.g., eubacteria, animals, fungi,
and the amoebozoan Dictyostelium). Homologs of each of these putative genes that are
more closely related to the Achlya and Thraustotheca proteins have been previously detected
in the genomes of Phytophthora infestans, P. sojae (Pythiales) and the white rust
Albugo laibachii (Tyler et al. 2006).
The annotated proteins recovered in this study include nine known to belong to
oomycete secretomes, and six of these are common proteases or other hydrolytic enzymes
such as chitinase and cellulase (Tables 2, 3; Birch et al. 2006; Gaulin et al. 2008; Kamoun 2006; Levesque et
al. 2010). One of the proteins belongs to the elicitin family, a family of virulence genes
unique to oomycetes (Jiang et al. 2006). Based upon these data, plastid-associated
genes are not present in the Achlya or Thraustotheca predicted proteomes.
Revised Hypotheses for the Evolution of Chromalveolate Plastids
These data, as well as the study by Stiller et al. (2009), indicate that oomycetes are
ancestrally aplastidic despite reports to the contrary (Tyler et al. 2006). This information
and the results of recent phylogenomics investigations have been synthesized, and
revised hypotheses for the evolution of chromalveolate plastids are presented in Figures
1 and 2. These diagrams reflect a number of assumptions that are enumerated for the
sake of clarity. (i) The Chromalveolata sensu stricto is paraphyletic (e.g., Iida et al.
2007; Khan et al. 2007; Rogers et al. 2007; reviewed in Green 2011). (ii) Oomycetes, all
other heterotrophic stramenopiles, as well as the ciliates are ancestrally aplastidic
(Archibald 2008; Reyes-Prieto et al. 2008; Tyler et al. 2006). (iii) The SAR clade is
recognized as natural (Burki et al. 2007; Hackett et al. 2007; Lane & Archibald 2008).
(iv) Recent studies imply that SAR and Hacrobia host cells are likely distantly related
(Baurain et al. 2010; Hackett et al. 2007; Parfrey et al. 2010). For these reasons, no
specifically defined relationship between SAR and Hacrobia host cells is implied in
Figure 1. The diagrams comprising Figure 1 are drawn under the assumption that the
Hacrobia is monophyletic (Burki et al. 2007; Hackett et al. 2007; Harper et al. 2005,
Patron et al. 2007).
These hypotheses share elements in common with prior models of
chromalveolate plastid evolution in which multiple plastid acquisitions (or plastid
replacements) are inferred via serial endosymbiotic transfer (Archibald 2008; Bodyl
2005; Bodyl et al. 2009; Bodyl and Moszczynski 2006; Sanchez-Puerta & Delwiche
2008). Two predictions derived from these models bear emphasizing: (1) alveolates
and stramenopiles likely possess tertiary or quaternary plastids and (2) it is
conceivable that one of these taxa, the alveolates or stramenopiles, may have obtained
its plastid from the other (Fig. 1). Finally, it is noted that the number of membranes
surrounding higher-order, complex plastids seems to be fixed at four or fewer.
Fig. 1. Hypotheses for the origin of complex, higher order chlorophyll a+c-containing plastids in chromalveolates. (A) Independent acquisition of a tertiary (3°) plastid in the alveolate and stramenopile lineages from the Hacrobia lineage. (B) Serial endosymbiotic transfer resulting in a quaternary (4°) alveolate plastid from the 3° stramenopile plastid. (C) Serial endosymbiotic transfer resulting in a 4° stramenopile plastid from the 3° alveolate plastid.
Table 1. Protein IDs for 46 hypothetical proteins detected in the genomes of Thraustotheca and/or Achlya characterized by the presence of a putative plastid-targeting 5’ signal peptide sequence. ChloroP was used to detect classical plastid transit peptides. HECTAR was used to search for bipartite plastid-targeting leader sequences characteristic of stramenopiles and other chromalveolates (Kilian and Kroth 2003; McFadden and van Dooren 2004; Vesteg et al. 2009).
Protein ID      SignalP   ChloroP   HECTAR
Thraustotheca clavata
THRCLA_02069    Y         Y         Chloroplast
THRCLA_03737    Y         Y         Signal peptide
THRCLA_03876    Y         Y         Signal peptide
THRCLA_04285    Y         -         Signal peptide
THRCLA_04386    Y         Y         Signal peptide
THRCLA_04952    N         Y         Signal peptide
THRCLA_05863    Y         Y         Signal peptide
THRCLA_06099    Y         Y         Signal peptide
THRCLA_07047    Y         Y         Signal peptide
THRCLA_08011    Y         -         Signal peptide
THRCLA_10855    N         -         Signal peptide
THRCLA_10997    N         -         Signal peptide
THRCLA_11248    Y         Y         No N-terminal target peptide found
THRCLA_11271    Y         Y         Chloroplast
THRCLA_11391    Y         Y         Signal peptide
THRCLA_11516    Y         Y         Signal peptide
Achlya hypogyna
ACHHYP_00269    Y         Y         Signal peptide
ACHHYP_01095    Y         -         Signal peptide
ACHHYP_01226    Y         -         Signal peptide
ACHHYP_01546    Y         Y         Signal peptide
ACHHYP_02169    Y         Y         Chloroplast
ACHHYP_02305    Y         Y         Signal peptide
ACHHYP_03044    Y         Y         Signal peptide
ACHHYP_03052    N         -         Signal peptide
ACHHYP_04549    Y         Y         Chloroplast
ACHHYP_04706    Y         Y         Signal peptide
ACHHYP_04908    Y         Y         Signal peptide
ACHHYP_05005    Y         Y         Signal peptide
ACHHYP_05180    Y         Y         Signal peptide
ACHHYP_05326    Y         Y         Signal peptide
ACHHYP_05770    Y         Y         Signal peptide
ACHHYP_06287    Y         -         Signal peptide
ACHHYP_06505    Y         -         Signal peptide
ACHHYP_06977    Y         Y         Signal peptide
ACHHYP_07400    Y         Y         Chloroplast
ACHHYP_08323    Y         -         Signal peptide
ACHHYP_09221    Y         Y         Chloroplast
ACHHYP_09519    Y         Y         Chloroplast
ACHHYP_10824    Y         Y         Signal peptide
ACHHYP_11025    Y         Y         Chloroplast
ACHHYP_11286    Y         -         Signal peptide
ACHHYP_11397    Y         Y         Signal peptide
ACHHYP_12628    Y         -         Chloroplast
ACHHYP_13722    Y         Y         Chloroplast
ACHHYP_14385    Y         Y         Signal peptide
ACHHYP_15409    Y         Y         Chloroplast
Table 2. Protein ID numbers, annotations (<1.00E-25), and protein family designations for 46 proteins from the Thraustotheca and Achlya genomes putatively possessing 5’ plastid-targeting signal peptides.
Gene/Protein ID Annotation pfam
Thraustotheca clavata
THRCLA_02069 putative GPI-anchored serine-rich hypothetical protein _
THRCLA_03737 cd05384: SCP_PRY1_like [COG2340] pfam00188
THRCLA_03876 hypothetical protein, with EGF-like motif _
THRCLA_04285 Kazal-type serine proteinase inhibitor pfam07648
THRCLA_04386 hypothetical protein _
THRCLA_04952 hypothetical protein _
THRCLA_05863 hypothetical protein _
THRCLA_06099 putative GPI-anchored serine-rich hypothetical protein _
THRCLA_07047 hypothetical protein _
THRCLA_08011 cysteine protease family C01A, putative pfam00112
THRCLA_10855 hypothetical protein _
THRCLA_10997 chitinase D-like pfam00704
THRCLA_11248 hypothetical protein, unknown function _
THRCLA_11271 hypothetical protein, elicitin superfamily pfam00964
THRCLA_11391 beta-N-acetylglucosaminidase pfam00728
THRCLA_11516 hypothetical protein, unknown function _
Achlya hypogyna
ACHHYP_00269 putative GPI-anchored serine-rich hypothetical protein _
ACHHYP_01095 beta-N-acetylglucosaminidase pfam00728
ACHHYP_01226 hypothetical protein pfam12937
ACHHYP_01546 hypothetical protein _
ACHHYP_02169 trypsin-like serine protease pfam13365
ACHHYP_02305 putative GPI-anchored serine-rich hypothetical protein _
ACHHYP_03044 putative chitinase-like carbohydrate-binding protein pfam00704
ACHHYP_03052 hypothetical protein _
ACHHYP_04549 hypothetical protein _
ACHHYP_04706 hypothetical protein encoding ricin_B_lectin pfam00652
ACHHYP_04908 hypothetical protein _
ACHHYP_05005 putative D-lactate dehydrogenase pfam01565
ACHHYP_05180 hypothetical protein _
ACHHYP_05326 Cellulase pfam00150
ACHHYP_05770 hypothetical protein _
ACHHYP_06287 hypothetical protein _
ACHHYP_06505 papain family cysteine protease pfam00112
ACHHYP_06977 hypothetical protein _
ACHHYP_07400 hypothetical protein _
ACHHYP_08323 hypothetical protein containing PAN domain pfam00024
ACHHYP_09221 hypothetical protein _
ACHHYP_09519 hypothetical protein encoding ricin_B_lectin pfam00652
ACHHYP_10824 ankyrin repeat protein pfam12796
ACHHYP_11025 hypothetical protein _
ACHHYP_11286 aldehyde dehydrogenase pfam0017
ACHHYP_11397 hypothetical protein _
ACHHYP_12628 papain-like cysteine protease C1 pfam00112
ACHHYP_13722 hypothetical protein _
ACHHYP_14385 hypothetical protein _
ACHHYP_15409 papain-like cysteine protease C1 pfam00112
Table 3. Proteins investigated in this study were sorted into one of 14 unique protein families, which are listed below. Note that all proteins investigated (see Table 2) are predicted to have a 5’ signal peptide and that nine of the 14 families include secreted proteins. Six of the families include proteases, and the elicitin family of virulence proteins is secreted extracellularly and is unique to oomycetes.
pfam ID   Protein ID(s)   Protein family / Conserved domains
00188   THRCLA_03737   Cysteine-rich secretory protein family
07648   THRCLA_04285   Kazal_2: Kazal-type serine protease inhibitor domain
00112   THRCLA_08011, ACHHYP_06505, ACHHYP_12628, ACHHYP_15409   Peptidase_C1: Papain family cysteine protease
00964   THRCLA_11271   Elicitin
00728   THRCLA_11391, ACHHYP_01095   Glyco_hydro_20: Glycosyl hydrolase family 20, catalytic domain
12937   ACHHYP_01226   F-box-like
13365   ACHHYP_02169   Trypsin_2: Trypsin-like peptidase domain
00704   THRCLA_10997, ACHHYP_03044   Glyco_hydro_18: Glycosyl hydrolases family 18
00652   ACHHYP_04706, ACHHYP_09519   Ricin_B_lectin: Ricin-type beta-trefoil lectin domain
01565   ACHHYP_05005   FAD_binding_4: FAD binding domain
00150   ACHHYP_05326   Cellulase: Cellulase (glycosyl hydrolase family 5)
00024   ACHHYP_08323   PAN_1: PAN domain
12796   ACHHYP_10824   Ank_2: Ankyrin repeats
0017   ACHHYP_11286   aldehyde dehydrogenase superfamily (ALDH-SF)
Table 4. List of seven proteins from the Achlya hypogyna and Thraustotheca clavata oomycete genomes and putative homologs found in the Arabidopsis thaliana plastid proteome. Reference refers to functional studies of the genes identified in this analysis.
Oomycete protein ID   A. thaliana plastid proteome ID   Gene annotation   E-value   Reference
ACH_05770   plp_at_01492   disease resistance protein related to DNA-binding protein WRKY1   2.00E-20
THR_04952   plp_at_01492   disease resistance protein related to DNA-binding protein WRKY1   7.00E-23
THR_08011   plp_at_00089   cysteine proteinase RD21A (=thiol protease RD21A)   4.00E-53   Shindo et al. 2012
ACH_15409   plp_at_00089   cysteine proteinase RD21A (=thiol protease RD21A)   1.00E-24   Shindo et al. 2012
ACH_12628   plp_at_00089   cysteine proteinase RD21A (=thiol protease RD21A)   1.00E-18   Shindo et al. 2012
ACH_06505   plp_at_00089   cysteine proteinase RD21A (=thiol protease RD21A)   9.00E-47   Shindo et al. 2012
ACH_11286   plp_at_00466   aldehyde dehydrogenase (ALDH)   6.00E-17
CHAPTER 2: Do chromalveolate genomes encode ‘green genes’?
INTRODUCTION
One of the most vexing problems in eukaryote systematics concerns the
interrelationships among the so-called ‘chromalveolates’ (Archibald 2008;
Cavalier-Smith 1999; Green 2011; Keeling 2004). The Chromalveolata is a paraphyletic taxon
whose members can be divided into two groups. The first group (the SAR clade)
includes the alveolates (apicomplexans, dinoflagellates, and ciliates), which are sister to
the stramenopiles (including phaeophytes, chrysophytes, and oomycetes). In turn, these two
clades are sister to the Rhizaria, a group principally comprised of free-living amoebae
(Burki et al. 2007; Hackett et al. 2007; Lane and Archibald 2008; Rogers et al. 2007).
The second group, the Hacrobia, includes cryptomonads and haptophytes and
lesser-known relatives such as the telonemids, centrohelids, and picobiliphytes (Burki et al.
2007; Elias and Archibald 2009; Hackett et al. 2007; Okamoto et al. 2009; Rice and
Palmer 2006; Patron et al. 2007). The exact relationship between host cells and plastids
belonging to members of the SAR and Hacrobia clades is unclear (Baurain et al. 2010;
Harper et al. 2005). Despite these uncertainties, it is clear that all photosynthetic
chromalveolates possess three or four membrane-bound secondary or higher-order
plastids ultimately derived from a red alga (Hackett et al. 2004; Janouskovec et al.
2010; Khan et al. 2007; Yoon et al. 2002; 2004; Sanchez-Puerta et al. 2007). How
these plastids were acquired is a contentious issue, but most recent models reflect a
growing consensus that multiple independent origins and/or serial endosymbiotic events
best explain most recent data (Bodyl 2005; Bodyl and Moszczynski 2006;
Sanchez-Puerta and Delwiche 2008).
The understanding of the evolutionary history of chromalveolates has recently
been further complicated by the unexpected discovery of so-called ‘green genes’ in
chromalveolate genomes. Whole genome sequencing and EST studies have revealed
that the genomes of chromalveolate species encode hundreds or thousands of genes apparently
derived from within the green algal lineage (Moustafa et al. 2009; Tyler et al. 2006;
Woehle et al. 2011). For example, the genomes of the diatoms Phaeodactylum and
Thalassiosira reportedly contain thousands of genes whose phylogenetic affinities lie
within green algae (Armbrust et al. 2004; Bowler et al. 2008; Chan et al. 2011; Moustafa
et al. 2009). Putative ‘green genes’ (albeit fewer in number) have also been detected in
the genomes of other chromalveolates examined (Cock et al. 2010). The presence of
‘green genes’ has led some authorities to speculate that the last common ancestor of
the chromalveolates once harbored a green algal symbiont that was later replaced by a
red algal symbiont that gave rise to the chlorophyll a+c-containing red-type plastids
that characterize most extant chromalveolates (Armbrust 2009; Dorrell & Smith 2011;
Frommolt et al. 2008; Moustafa et al. 2009). In short, the green genes found in
chromalveolate genomes are hypothesized to have been obtained via endosymbiotic
gene transfer (EGT) (Huang et al. 2004; Reyes-Prieto et al. 2008; Slamovits and
Keeling 2008; Tyler et al. 2006).
Other studies, implicitly or explicitly, suggest that the green phylogenetic signal in
chromalveolate (particularly diatom) genomes may be more apparent than real. Biases
associated with the heuristic phylogenomics pipelines needed to construct
genome-level trees and the uneven distribution of protein sequences for eukaryotic taxa have
been previously described (Stiller et al. 2009; Woehle et al. 2011). In this study, two
chromalveolates, the nonphotosynthetic stramenopiles Achlya and Thraustotheca, were
canvassed for proteins of putative green algal origin. These proteins were annotated and
combined with homologs from other oomycete genomes or expressed sequence tag
(EST) databases and with homologs representing all other available eukaryotic taxa. The
phylogenetic trees obtained were used to (1) determine whether nonphotosynthetic, aplastidic
oomycetes encode green algal genes similar to those found in diatoms and other
chromalveolates (note that if oomycetes are ancestrally nonphotosynthetic, then their
genomes should not encode ‘green genes’) and (2) critically reassess the veracity of
green genes found in chromalveolates in toto.
METHODS
The genomes of Achlya hypogyna (ATCC 48635) and Thraustotheca clavata
(ATCC 34112) were sequenced and assembled, yielding 17,430 and 12,154 predicted
proteins, respectively. Green genes possibly obtained by HGT or EGT events were
identified using evolutionary gene network (EGN) analyses as described in Bittner et
al. (2010). In brief, all sequences were BLASTed against one another. Two sequences
were connected in the EGN connected-components graph when their BLASTp E-value
fell below a user-defined threshold and their sequence identity
equaled or exceeded a user-defined limit. For
example, an EGN with user-defined parameters of ‘1E-20 at 80% similarity’
connects sequences that have BLASTp E-values below 1E-20 and sequence identities
equal to or greater than 80%.
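
The network construction described above can be viewed as a connected-components computation over pairwise BLASTp hits. The Python sketch below illustrates the idea under stated assumptions and is not the EGN software of Bittner et al. (2010): the function name `egn_components`, the hit tuples, and the sequence names are hypothetical, and an edge is drawn when the E-value is below the threshold and the percent identity meets or exceeds the limit.

```python
def egn_components(hits, max_evalue=1e-20, min_identity=80.0):
    """Group sequences into connected components.

    hits: iterable of (query, subject, evalue, percent_identity) tuples.
    Two sequences are linked when evalue < max_evalue and
    percent_identity >= min_identity. Sequences with no qualifying
    hit are not reported.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue, identity in hits:
        if query == subject:
            continue  # ignore self-hits
        if evalue < max_evalue and identity >= min_identity:
            parent[find(query)] = find(subject)

    groups = {}
    for seq in list(parent):
        groups.setdefault(find(seq), set()).add(seq)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Made-up hits for illustration only; these are not thesis data.
hits = [
    ("oomycete_1", "green_alga_1", 1e-45, 85.0),    # linked
    ("green_alga_1", "green_alga_2", 1e-30, 91.0),  # linked
    ("oomycete_1", "fungus_1", 1e-25, 60.0),        # identity too low
    ("diatom_1", "red_alga_1", 1e-10, 95.0),        # E-value too high
    ("diatom_1", "diatom_2", 1e-50, 99.0),          # linked
]
print(egn_components(hits))
# [['diatom_1', 'diatom_2'], ['green_alga_1', 'green_alga_2', 'oomycete_1']]
```

Raising `min_identity` splits networks into smaller components, so a sequence pair that clusters at a permissive setting may separate at a stricter one.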
In this study, batches of networks were constructed separately with minimum
protein identity thresholds of 35, 45 and 65% and an E-value threshold of 1E-20.
Networks including oomycete proteins and one or more protein sequences derived from
representatives of (1) the green algal lineage (GAL) or (2) Fungi were selected for
further investigation. Annotations for candidate HGT/EGT proteins in the Achlya and
Thraustotheca genomes were then refined using NCBI’s conserved domain (CDD) and
KOG databases (Marchler-Bauer et al. 2007; Tatusov et al. 2003), and the annotated
proteins were used to drive BLASTp searches aimed at recovering more distantly related
eukaryotic homologs from GenBank. Homologous sequences from representatives of all
available eukaryotic lineages were selected, aligned using “Geneious Alignment” with
default settings in Geneious v5.5 (Drummond et al. 2011), and manually edited as
necessary. Thus, each protein alignment included all sequences in the EGN of interest, as
well as a number of more distant homologs from other eukaryotes. Maximum likelihood
trees for each protein alignment were constructed using PhyML (Guindon et al. 2010) with
the WAG substitution model (Whelan & Goldman 2001) and 500 bootstrap replicates.
Bayesian posterior probabilities were calculated using the MrBayes plugin for Geneious,
run with default settings and the WAG substitution model.
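The network selection step described above (keeping only networks that contain oomycete proteins together with GAL or fungal proteins) could be sketched as follows. The function name, the `taxon_of` lookup table, and the sequence identifiers are hypothetical and serve only to illustrate the filtering logic, not the pipeline actually used.

```python
def select_candidate_networks(components, taxon_of,
                              focal="Oomycota", partners=("GAL", "Fungi")):
    """Return (component, partner_taxa) pairs for components that contain
    at least one focal-lineage sequence and at least one partner sequence.

    components: list of sets of sequence IDs.
    taxon_of:   dict mapping sequence ID -> lineage label (assumed input).
    """
    selected = []
    for component in components:
        taxa = {taxon_of[seq] for seq in component if seq in taxon_of}
        shared = [p for p in partners if p in taxa]
        if focal in taxa and shared:
            selected.append((component, shared))
    return selected

# Toy example (invented identifiers and lineage labels).
taxon_of = {
    "achlya_1": "Oomycota", "chlamy_1": "GAL",
    "achlya_2": "Oomycota", "yeast_1": "Fungi",
    "achlya_3": "Oomycota", "phyto_1": "Oomycota",
    "volvox_1": "GAL", "chlamy_2": "GAL",
}
components = [
    {"achlya_1", "chlamy_1"},   # oomycete + green alga: kept
    {"achlya_2", "yeast_1"},    # oomycete + fungus: kept
    {"achlya_3", "phyto_1"},    # oomycetes only: dropped
    {"volvox_1", "chlamy_2"},   # green algae only: dropped
]
candidates = select_candidate_networks(components, taxon_of)
for component, shared in candidates:
    print(sorted(component), shared)
```

In practice this filter would be applied to each batch of networks (35, 45 and 65% identity) separately, so that a candidate appearing only at low identity can be distinguished from one that persists at 65%.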

RESULTS AND DISCUSSION

Because they are ancestrally aplastidic, oomycetes are a perfect foil for
examining the hypothesis that chromalveolate genomes harbor varying numbers of
green genes acquired via EGT from an ancient green algal endosymbiont (Dorrell &
Smith 2011; Moustafa et al. 2009). Genes of cyanobacterial and/or red algal origin were
originally reported for the genomes of Phytophthora ramorum and P. sojae, but it has
since been demonstrated that these genes are very unlikely to reflect cyanobacterial or
red algal contributions to these genomes (Tyler et al. 2006; Stiller et al. 2009; Woehle et
al. 2011).
In this study, 12 protein-encoding genes from the Achlya, Thraustotheca or
other oomycete genomes were examined that, based on EGN analyses, are closely
related to genes found in green algae (Table 1). Three exemplary EGNs are
depicted in Figure 1. These networks indicate that Phytophthora spp. share one or more
copies of the pyruvate phosphate dikinase (PPDK) gene with the green algae
Chlamydomonas and Volvox (Fig. 1a). The PPDK gene is, however, absent from the
genomes of Achlya and Thraustotheca, and this observation, coupled with the current
understanding of oomycete systematics, implies that PPDK was likely acquired in the
Phytophthora lineage following the pythialean/saprolegnialean divergence (Beakes &
Sekimoto 2009; Sekimoto et al. 2009). In any event, the PPDK network clearly
demonstrates a putative green algal gene in Phytophthora spp. that is unknown in other
oomycetes. If the Phytophthora spp. PPDK genes were acquired via EGT, then this
observation is most parsimoniously interpreted as a recent event, not one that can be
associated with the presence of an ancient green algal symbiont. All oomycetes
examined encode single copies of eukaryotic translation initiation factor 5B and an
aldehyde dehydrogenase whose most similar homologs are putatively found in the
bryophyte Physcomitrella patens (Fig. 1b, 1c, respectively).
Maximum likelihood (ML) trees for six of the 12 oomycete proteins of putative
green algal origin examined in this study are depicted in Figures 2–7. These six were
selected for presentation because they include the broadest taxon sampling and are the
best supported; trees for the remaining six proteins are equally problematic, or worse
(see below).
A tree of DEXDc homologs is presented in Fig. 2. The EGN for
DEXDc implies a green origin for this gene in oomycetes, specifically uniting oomycete
homologs with the sequence from Chlamydomonas reinhardtii (not shown). Note,
however, that in the tree the C. reinhardtii DEXDc sequence terminates a very long branch
and that when other eukaryotic homologs are added the oomycete/green relationship
becomes less clear. In fact, this tree implies that oomycetes share a common ancestor
with the opisthokonts (fungi and animals), a result clearly at odds with the current
understanding of eukaryotic systematics. In summary, at least two phylogenetic errors
are apparent in the DEXDc tree: long branch attraction and a topological error that can
likely be traced to problems associated with taxon sampling, i.e., clear homologs of the
algal, plant, oomycete, and fungal DEXDc genes have yet to be identified in other
eukaryotes. The same issue, taxon sampling, and specifically the differential distribution
of homologs among eukaryotic lineages, also plagues the RPB tree (Fig. 3). Bearing in
mind that protein sequences for animals, fungi and plants far outnumber those available
for other organisms, the RPB subunit II tree implies that the alveolates are sister to a
clade including stramenopiles (brown algae, diatoms, and oomycetes), animals, and
green algae + land plants (Fig. 3). This topological error is likely compounded by the
observation that the alveolate sequences terminate long branches whereas the
embryophyte sequences terminate shorter branches; heterotachy is a well-known source of
phylogenetic error (Kolaczkowski & Thornton 2008; Pagel & Meade 2008; Philippe
et al. 2008; Shalchian-Tabrizi et al. 2006). The ALDH tree implies that the stramenopiles
are not monophyletic; green algal sequences are nested within a clade including
sequences for alveolates and stramenopiles (Fig. 4).
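As a rough illustration of how unusually long terminal branches, such as the one described above for C. reinhardtii DEXDc, might be screened for in a Newick-format tree, consider the sketch below. It is a heuristic under stated assumptions, not a method used in this study: the regular expression assumes simple unquoted leaf names, the cutoff of three times the median terminal branch length is arbitrary, and the toy tree and its branch lengths are invented. A flagged branch only marks a taxon worth inspecting; it does not by itself demonstrate long branch attraction.

```python
import re
from statistics import median

# A leaf is a name that directly follows "(" or "," and is followed by
# ":<length>". Internal-node support values follow ")" and are skipped.
LEAF = re.compile(r"[(,]\s*([A-Za-z_][^:(),;\s]*):([0-9.]+(?:[eE][-+]?[0-9]+)?)")

def terminal_branch_lengths(newick):
    """Map each leaf name to its terminal branch length."""
    return {name: float(length) for name, length in LEAF.findall(newick)}

def flag_long_branches(newick, factor=3.0):
    """Return leaves whose terminal branch exceeds factor * median length."""
    lengths = terminal_branch_lengths(newick)
    cutoff = factor * median(lengths.values())
    return sorted(name for name, length in lengths.items() if length > cutoff)

# Toy tree with invented branch lengths (not a tree from this study).
tree = ("((Achlya:0.10,Thraustotheca:0.12)0.95:0.05,"
        "(Chlamydomonas:1.40,Volvox:0.15)0.60:0.04,Homo:0.20);")
long_taxa = flag_long_branches(tree)
print(long_taxa)
```

Taxa flagged this way could then be removed or supplemented with additional homologs to see whether the contested topology persists.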
These same types of phylogenetic errors are evident in Figures 5–7
and are not described again in detail. What these trees clearly demonstrate, however, is
the pervasive influence that the vast number of sequences available for fungi (80+
complete genomes) may have on phylogenomics studies (cf. Stiller et al. 2009). The
TOR-containing kinase tree suggests that green algae may not be monophyletic and that
green algae and stramenopiles are, again, sister to animals and fungi (opisthokonts)
(Fig. 5). The unorthodox relationships among green algae, oomycetes, and fungi are
also recovered in the YAK1 tree (Fig. 6). The ALS tree is equally vexing and seems to
suggest that the chromalveolates (in toto?) may have obtained their copy of this gene
via horizontal gene transfer from fungi (Fig. 7).

Green Genes in Oomycetes and Other Chromalveolates?

On the basis of the data collected, the notion that chromalveolate genomes encode
hundreds or thousands of genes derived from green algae is false.
Critical analyses of protein-encoding sequences of putative green algal origin from
oomycetes and other chromalveolates yielded trees seriously compromised by a
number of obvious and well-known sources of phylogenetic error. These included, at
minimum, biased taxon sampling, long branch attraction, and heterotachy. This
argument is bolstered by the curious fact that so-called ‘green genes’ can be detected in
oomycetes even though these organisms are ancestrally aplastidic. These results, and
those of Stiller et al. (2009), suggest that these biases are currently so prevalent that
broad-scale evolutionary scenarios drawn from phylogenomics studies need to be
interpreted with a higher level of skepticism.
Table 1. List of 12 annotated proteins from the Achlya and/or Thraustotheca proteomes or other oomycetes found in EGN connected components graphs clustering with homologs from green algae.

Protein                               Annotation
TOR-phosphatidylinositol kinase       phosphatidylinositol kinase, putative target of rapamycin (TOR)
Yak1                                  PKc-like superfamily, Yak1-like protein kinase
acetolactate synthase (ALS or AHAS)   TPP_AHAS [cd02015], thiamine pyrophosphate (TPP) family, acetohydroxyacid synthase (AHAS) subfamily
DEXDc                                 DEXDc superfamily, pre-mRNA-splicing factor ATP-dependent RNA helicase PRP16, putative
RPB                                   RNA polymerase beta subunit, cd00653: RNA_pol_B_RPB2
RRM                                   RRM superfamily, PREDICTED: cleavage stimulation factor subunit 2-like
RRM2                                  RRM superfamily, PREDICTED: similar to RNA binding motif protein
Sm_D1                                 Sm-like superfamily, small nuclear ribonucleoprotein D1
Sm_E                                  Sm-like superfamily, small nuclear ribonucleoprotein E
thioredoxin peroxidase                thioredoxin-like superfamily, cd03015: PRX_Typ2cys
threonine protease                    threonine protease family T01A, putative, cd01911: proteasome_alpha
ALDH                                  ALDH-SF superfamily, cd07084: ALDH_KGSADH-like

Fig. 1. Three examples of putative green genes in oomycete genomes based on EGN analysis conducted at 65% protein identity. (A) All species of Phytophthora in this analysis share a copy of pyruvate phosphate dikinase (PPDK: P. infestans gene ID 03724) with Chlamydomonas reinhardtii and Volvox carteri, two microscopic green algae. Note that PPDK is not encoded in the Achlya or Thraustotheca genomes. (B) The moss Physcomitrella patens shares both eukaryotic translation initiation factor 5B (P. infestans gene ID 20386) and (C) an aldehyde dehydrogenase (P. infestans gene ID 00034) with all oomycetes included in this analysis.

Fig. 2. DEXDc ML tree: Oomycetes are shown sister to animals, sharing a common ancestor with fungi. The phylogenetic errors demonstrated include long branch attraction and topological error due to sampling bias.

Fig. 3. RPB ML tree: Alveolates are shown as sister to a clade including stramenopiles, animals, and GAL. Long branches in the alveolate clade and short branches in the GAL, stramenopile and animal clade are indicative of topological error due to heterotachy.

Fig. 4. ALDH ML tree: Stramenopiles and GAL are shown as not monophyletic. Long branch attraction between GAL, stramenopiles, and alveolates is likely responsible for the phylogenetic error.

Fig. 5. TOR-containing kinase ML tree: Stramenopiles are sister to GAL and are shown sharing a common ancestor with animals. Heterotachy and topological error due to sampling bias are demonstrated.

Fig. 6. YAK1 ML tree: GAL and stramenopiles are shown sharing a common ancestor with fungi. Long branch attraction between GAL and stramenopiles, heterotachy, and topological error due to sampling bias are demonstrated.

Fig. 7. ALS ML tree: A two-clade tree is shown, which makes inferences about the relationship between the two clades impossible. Phylogenetic error is likely due to the abundance of available fungal genome data (sampling bias).

SUPPLEMENTAL INFORMATION
Table S1. Selected hypothetical proteins (n=16) from the Thraustotheca clavata 2 genome possessing putative 5’ transit peptides. Chloroplast transit peptides predicted 3 using ChloroP (v.1.1) are shown in bold face. Transit peptide sequences predicted 4 using SignalP (v.4.0) are underlined. 5 6 >THRCLA_02069 7 MVRISALLGTFALIHAQTTTAPPASASNSWTMTTVNSIQARVVSDAATWDATNKKFG8 LVMKQNTVTFPDQYRAAMDTVNTASVEGALFYVQTEGINKQFDVNCMRKTNMSYIWF9 LNVTIVQPTFAIAEYADNGGVVPEYGKFIAMDNGQCTPLDTKGTMSDECMTLGGLNYH10 ANIGPFIGGEPRKEHLLAKYPDNIWFSYPNSCFTKTFIAKDTKCREAQKGGLCPLGVQP11 DGIKCTYSFDILGYIRIDELVGITNLTNSQTGQKYKDRVEFCKDSKVEFDFSTMKSDLTF12 WDNPTDEAANTNRTTKMLELYNNLIKTGTGDAAYMKSLPTAAELTAKNPPCWKNSPIC13 ATAEFGCRRKLTAQICEKCTSASPDCKKPTSSDSVPPKLTKAVAPPLPTDASGKTTVP14 RNPTGAGGNGNAAAAESSASSLVAFTSLIITLAALFA 15 >THRCLA_03737 16 MKSTFVLLAAISLVNASSSTKLRGAAPCPNSNSGSSDNSSDYSGSESNWDSGSGSD17 WDDCGSGSTSTSDSGSNDYPSNWDSNSGSDTTEEPATYAPAPTSAPTSAPTETPAT18 SKGTLKEQIIHQTNLIRAAHGLGPVKWNDELAAKMQAWANSDPQQNGGGHGGPPGN19 QNLASFDVCNDNCMRMTGPAWAWYSGEEKLWDYDANKSRDGIWETTGHFSNSMDP20 GVNEIACGYSTFYNPQIGHDDSLVWCNYLGGNNGVIPRPRIDQATLEKQLTSAY 21 >THRCLA_03876 22 MNLKAWILSVAIASAAAASGSSSGSGSTTDAPLTQENLSSRPGLCNTSKDCAKYTKG23 SNVYSCIAVKSNIVNLTTLKQCVLGDGCSGGKAGSCPTFTSWPQKFRQVQPVCAFVA24 VPNCNSAVNSQGQVVSVRSLREQAAKPGNVTCFQAKFGSNSSSSDDSATVYGIYQCV25 DKKLYAEKNLGYLDNTPKQLQSCAGNVTVVNGQSVSNVLCNGHGTCVPQTDFSDIYK26 CLCSTGYSDKDNCGAATGNVCSAFGQCGNGNCNPDTGKCVCPYGSTGDQCSKCDP27 AQNNNASVTNMCNGNGKCGIDGTCQCSDGYLGTNCETQIKKNSTASSATGSTTSSKK28 SAASGLHEASIAIFSIATIFAAALI 29 >THRCLA_04285 30 MQIKSIIATLTLAALAQADNNNCEKSCTKELSPLCASNNETYNNLCLFQIAQCQQPTLTI31 SANQSCSTNVKFCTRLCPTVYQPVCGSDNTTYPTECDLKNKACNNPSLTVTKQGACD32 NCPKACLEILAPVCGSDGKTYDNTCFLLKTACANPSLNLTFVSTGSCTNGNNTTTTAPP33 SGTTLPPSGTTLPPTTSGNPSTTTTPPTTKPASSATTAMLSLMSAAAIAITYML 34 >THRCLA_04386 35 MKWQVALLSLVTSGIAQDHCGSTTVPTIVPTPAPTLAPTPAPTPAPTPAPTPAPTPAPT36 PAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPTPAPLAVATATWTNLW37 SDIVQVATDNTQICIRETNGDVDCKPWSTDSSLPTVYGGHSSNFLATGGGWSISTVNN38 
VNYLVVISPLYNANVMVLDEAILYAATDGATCCITTSTFRCASQKLDMTFVKMTDKYITS39 SSIYNAVIYGVDAQGKLYKGSTASISTGVANWQEVSTPCPFTQVSYDGTTLCGLYAST40 NTIVCTSGTLSLQPNWVALQSNKWKQFSITQSYIYAVDTSNNVQRLQISQPIAVAP 41 >THRCLA_04952 42 MTLASSPTFSRPLLLPPLTSALSPSIAQQMKRQHECEGGGSVKRHCSTFPYMEMPRL43 PSITQPSSHIGYLSESYYPSPTSLPMLPPASTLLQQATRKSMDLVPSNAYAPTLPEPCT44 LYKSNENTKPSPSNEEVRGECLDAQCHNSVKHRGYCKLHGGARRCDVPGCPKGVQG45
GNLCIGHGGGKRCRFPGCSKATQSQGLCKAHGGGVRCKYDGCNKSSQGGGFCRRH1 GGGKRCSVAGCPRGAQRGTTCAQHGGKAQCMIDGCVRADRGGGYCEVHRKDKVC2 RQGYCNRLARIKCEGYCTQHHREFCITSPPQ 3 >THRCLA_05863 4 MGSVLVLLFSPLHAWLTSSSSSSLPCLQTSFLLSQNASLDQIATAQPRINDAIVSFLAQ5 SNVQSRIQWTNVDITTSTIDGNGPAIVQMCLYVPPNTSVNQVAMAMSATVNWSGLKTS6 ISTLHRRFQLFDLTTPLQVLNQVTQYNFQRIPFPYVQWYLVVQGRFDYFWPQIKIKHAIA7 VLLNISSSSVIPQDIIFPPYDAYNDIATILPFAITQVNSSTFARTANTLTGPLQDILALHGILL8 LTQFPDPNGNGKLQQSVPWPEYPQLDPTSFYPFHNWTPVPNSFVVKLIYGGLLTLTN9 MSSVILQVLDVLDSPQTANFTDFQTLTLTYPPNNGTATFESSRYNTLDFIVAGDRSTLE10 ANQQTLGESLYQIGVSIFDVIDINSTMQTAQWYPYMQLDCPYNLSALASIIQRIALAAFF11 SIPLSSIQLIEIATNSTTFEIACNDTLEQRYLKKQLKETTRWSTVMNNFTANSAFCTIGGE12 SLAYPPMFPGSTYGWSQPSSSMDNTCSVNTIELTACDQCDRYLNAVCFTNPNCYQTQ13 TTLLSQLLVSSNASSVFQQLSLSTSANTKTLNTLALYYSCIAAFQCLIAPNTSIITSDEVYT14 IDINANGANFSTTLYYPQDDIYLVLNDQTTLEEIQINLSSSISNSIFVNVSGTSSSFNVTM15 DSVVIPFQLPVIAYSTVPATIQRISASIPQLVFLSNSSNDTTVLLNGKCTTCLTQMDECK16 MSPSCPSIAICWSNVVESAISQLDSVYSTLEISTQLISCYENASLEDFEMFLRVQKCLLQ17 SSCPISPTLESIVKGTMIVLRSTTGFQTIELTPTPAVTLTIGTESIILSSNSISGLQATMINFL18 SPLCQASIQSNTANLTIQFNDFGAPILPTINGTIYSQMPRIFLDRMPLDSSRFGFSYQSY19 KQLSPSSLPNAFTTTLNSNCQMCQNLFDQCLLSSFCASIISNFQNTIAGATNAFIGWSV20 ALQRLSFDIPEWDQFAQTLSCFEIHNCPINSTISMLKNGRMLLLSSTPVVLSVTFSSSPF21 EAAIYVQRFRQPINVSSNSSAAYIQGQFQMNFGSLALTNVSITNTSMELSLNSYYGPTP22 EFMVTSSEFSNKTIILGTSMVSVVSYSPAAYFPY 23 >THRCLA_06099 24 MKFALVSSLAVLASAQTNNSSAGSNSNVNCPLQFTSACANTQECGTLNGYPLECQV25 YGSVKQCVCSKENANCQNSTNIANTIPQFGVCTGGKQCAGSGFKALQTPVRTCSEQL26 VCIPQYASGNELQSICHTCSSCKQQNKPDATGRLIFNCTQICPLGQGDPIVTIPPVTTAP27 TNSTKKNDSSKGSGSTAGSKPKSAATSIVAGVATVAIVAIASLF 28 >THRCLA_07047 29 MILINLLFGLRLCTDGVSLLQQQVPRKPSKRTKQSRCKHVPFVASTALKPTHETLAPL30 MPLVVYQEVTENDMAHLISLVDNQDNQEDNEEITENVVADVFVPLVDNQVSQEANEN31 VVVDFADNQDNQEANENVAEFEPLVNYQDNQENVAEFVSLVDNQGSQEASENVVVE32 LVPSVVCRDSFEPTEEDVAAVLHGRFAANQAALLRVSSFQPADDRSLTAIQLIRYFELY33 HLVRMDYNQLRHLEPSRLEKIQLVRLSILERQAIEAMLSDVAELWSRQPNDVSSAKKLQ34 WFKNLQYGLMWDMLELLEHQKPDHHCARGLCPQLYQEKLDIIYSE 35 >THRCLA_08011 36 
MKTIFLTTALLASTSCALQMTNKERNEILDELNKWKQSAVGKAALVHNFLPSSQRQEG37 LSIDAKQDLEITRFAHTKKVVEQLNKEHKGSAVFSTNNMFALMSDEEYKKWVKGAFGR38 DHKKRQLRGENIQLELTAEQREASGIDWTSNKCMPAVKNQGQCGSCWTFASVGAAE39 MAHCLVTGNLLDLAEQQLVDCASDAGQGCQGGWPTKALQYITQTGMCTSRDYPYTA40 SDGQCNNSCKKTKLSIGEPVDIQGESALQSALNKQPISVVVEAGNDVWRNYQSGIVQQ41 CPGAQSDHAVIAVGYGSDGGDYFKIRNSWGAEWGEQGYIRLRRGVGGKGMCNVAE42 GPSYPSMSGKPNPDGPTDEPSNDPTDEPSNDPTDEPSNDPTDEPSNDPTDDPSDDP43 TDDPWNGSNDWDWGN 44 >THRCLA_10855 45
MIVPVVLMALTGISITGVTLRWCCSHRQKTSWKEERKEPLLATPVPLPRIKTDVFIERSI1 AMDVPLMETCSGCGAWIDPSLAAIANGLCVVCSYQTIPSLEIDIDENISENDETSNDKDS2 DKPILTDEDTTADIESPKDVEIQIEESNFEDEMEDISTTSQDMVIPISQDNNCNEGDEDV3 EIAEEALALVQDMWDIAYQAHLGVGDDPTADIFVEMALDLDATAEAIKKEPHLLSESFH4 FLSLSLASLMELVPEAWVAHVEATELKFEALQFRYHSKLTVENCLDLATHLYELVECAQ5 EFGVDPAVASSLMDGLEELVEAIEETPCELVSWLAYLAATVKLLKSYQRDFEQAEMWD6 TVVECERNLEPLEMHCWEIYSPC 7 >THRCLA_10997 8 MKASLCIATLAAMGSIASSRNIRHHAESVMGNPVQRRSESTRLPTHPLTGYWHDFPN9 PAGDTYPLTQITKDWDVIVVAFANSLGSGKVGFDVDPKAGSETQFIKDISTLKAAGKTIV10 LSLGGQNGAVTLNDATETANFVSSVYDLIKKFGFDGIDLDLENGISKDLPIINNLITAVKQ11 LKQKVGDSFYLSMAPTYGGIWGAYLPIIDGLRNELTQIHVQYYNNGGFVYTDGRTLNE12 GTVDCLVGGSVMLIEGFQTNYGNGWKFNGLRPDQVSFGVPSGTSAAGRGFVTPEVV13 KRALTCLVQGVGCDTVKPPKTYPTYRGAMTWSINWDSHDGYVFSRPARQALDSLGG14 SPPQPNPTAVNPTDAPNPLTNPPTSRPTNTPTVTPTQSPRPTSQPTSLPTSSPSSVPTI15 NPTPIPTSVAPQPTQAPSSSC 16 >THRCLA_11248 17 MANTIQWLFIYCVIVASQGPPNNGERTCSVTLGGPVSQTSTAGTMSFCTAFPQERCC18 LPVHDEYVKSTFYALLDSGYICASATNTAIAHLQTMFCLACDPSMSLYLTPPRNTTFFS19 APQTLKVCRALAISFKQHIDAVSPYYFSDCGLTYAGDRNNLCIPKTAISPNMVFPGCSE20 GQNICYSTTQGYYSPIWYCSSSPCGPDTPFGLNDIPCSGPTCTPAFQFLNDNRAAKPP21 FFEPFAVEIIDESTCAPGESSCCMTDSSIVPTS 22 >THRCLA_11271 23 MKTTAFVLALASTAAASSPCTGSAVITAVTPLIAQATTCSTDSGFDLVALISGTTPTDA24 QKQKFLTAESCKTLYASVQKSLAGITPACTIGDIDTSGWSTVSMDKGLDALIKSLPSLLA25 SSGATNSTSNSTANSTISSTTVSPSSTTAAPAKSGVAATGVTIAAVALTTAILHLNANKQ26 QEIHEHLRLTIKESDVETLGEVMSMSLIPAAEAHQFI 27 >THRCLA_11391 28 MKLSILLAAFGVVASSSIPKHTYKCNDGVCVQTPLNGAGVSLGSPLLSLRMCEMTCG29 AGSLWPYPASVSLGTTATAIDTNKVSHSIKINGAEATSTLTNSIVQTFNEGVKAKTKWV30 RGQSEIGAISHSIYGTISSNNEVLGQDTDESYELSIDGPRVKINAATIYGYRHALTTLNQL31 IDYDELTNSVKMISKATISDKPAYSHRGIVLDTSRNFYPIESLKRMIDTMGANKLNTFHW32 HMTDSSSFPIEINGEPRLTTYGAYSAEQIYTQDQIRDLVQFAKARGVRIIPELDAPAHAG33 AGWQWGPKAGYGDLTLCYGADPWMNYCLEPPCGQLNPLNKQVYSVLDTVYKELTSL34 FDGDVFHMGGDEVSIPCWNSSKVITDHLKDTNKPGAFFDLWGDFQTKAAAMLNKKVM35 VWSSDLTTDPYLKYFEPNNTIIQLWGGSTDGDATRITSQGYDVVASYWDAYYLDCGFG36 GWVSKGNGWCAPYKSWQVIYDLDITANMTAANAKHVLGSEVAMWSEIADAHVVETKV37 
WPRAAALAERLWTNPKTDWKSAMGRMRIQRDRIADAGIGADAVHPLWCRQNPGKCQ38 LV 39 >THRCLA_11516 40 YTCVAVQTAIAGIALASQCVLGTTCGGNSAGQCPTFSSWSSSYQKIQPVCAFVNVTN41 CVNFIKAGSEAKATSGSGSTSTVNCYQATFSANNISQVVSGIYKCVDSGLYVSQNLGAI42 KNLTTTQMDVCAGNLTTSVGALCNGHGTCAPTAAFSSKYQCICNEGYSATDNCNVAT43 SNVCNAFGSCGAGNTCDTTSKQCSCTTGTTGPQCSLCDPTASSSVVCNGNGVCSSS44 GTCTCNSDYTGSLCSRTATTNSTGSNKSSSSSHLVASLATIATCLLAILM 45
1 2 Table S2. Selected hypothetical proteins (n=30) from the Achlya hypogyna genome 3 possessing putative 5’ transit peptides. Chloroplast transit peptides predicted using 4 ChloroP (v.1.1) are shown in bold face. Transit peptide sequences predicted using 5 SignalP (v.4.0) are underlined. 6 7 >ACHHYP_00269 8 MVRTLSLLLLAAGVAGQTSTTPVPTPAVSNPPFTMTLVNSIQARVVAEAATWDETNQ9 KFGLVLKQNTNTFEERYRAVMDTVNTASVEGALYYVQTEGIDKPLQTGCMRKTNMSYI10 WFLNITMVQPTFAIAEYQDNGGVVPEYGKFVAMDGGLCTPVGTETPLECLTYGGLNFN11 KNLGQWVGGEARKKNGRANYDDNYWFSFPNSCYTMRFDAKTKACRDLQKGGLCPIG12 TQPDGVKCTYSFDVLGYLAIDDLVGITSMKNTLTGQNFKGFSEFCKAGKTEYNFADSS13 SDLTFWNDPLEPAANANRTKVMMQKYNDLVQNGVGDQKHMKALPSVEELTKANPPC14 WKNSPRCATAANGCRRKLLSQICEVCSAPADDCKKPGPNDKAAPMLNKQFQPALPTD15 ATGNTKQPRAPNAAPLDAPAGGAGGNVIKGSGAAATSLILATAVGLVALAV 16 >ACHHYP_01095 17 MLARLAALIGVAAALQVPFTTYECVRGRCEPRPRSFSPPDSASSLRLCEMTCGAGNL18 WPLPTSVSLGTTTRVVSVDYVSHTVTFLDNSVPISPLVGAIQRIFDNTLALKATECALAS19 VGGAELAVTASIESGNEVRDYFRTFTMAADDNTMVQELELETDESYTLTIVDGAATIHA20 ATVYGYRHALTTLSQLIEYDELSHDMHIISAVTITDAPHFAHRGIVLDTSRQYYSVPAIKR21 LLDGMGATKLNSFHWHFTDTASFPIEIKGEPRLTAFGAYHPRSVYTQQAMRDIVAYAR22 ARGVRVIPEVDAPSHVGAGWQWGKDAGLGELAVCFGHNPWTEACVEPPCGQLNPF23 NPHVYDVLETVYEELNEIFDSDVFHMGGDEVHLGCWNMSAAVTAHMTDRSPDAFYRV24 WGRFQMQARQLVGEKKIAVWTSDLTNAPYLRKYFDPASTIIQMWTLSTGSDAARFTA25 QGYPVIASYYDAYYLDCGFGNWLLKGADWCTPYHHWSVLYDLDVLHNVPAAQRNLVL26 GGEVALWSEEVDEATMDAKIWPRAAAAAERWWSNPVNGTWKDAIDRMRIQRDRLVD27 IGLQADALQPLWCRQNAGDLSQGSGISISATVKSKSEALTVDTDESYELSIDGPKVSIN28 AATVYGYRHALTTLNQLIDYDEISNSVKMIAKAKIADKPAYSHRGIVLDTARNYYSIDSLK29 RLVDTMGANKLNTFHWHFSDSSSFPFEIKSEPRLTSYGAYSKDQVYTQDQIRDFVQFA30 KARGVRIIPELDAPSHAGAGWQWGPKAGYGELTLCYGSDPWMDYCLEPPCGQLNPL31 NDHVYDILKTVFEEMHGLFDSNVFHMGGDEVSVPCWNSSKVITDHLKNTTSNAPFFDL32 WGTFQTKAGALIEKANKKIMVWTSDLTTDPYLKYFKPSNTIVQLWGGSTDGDAERLTS33 KGYEVVASYWDAYYLDCGFGGWVSKGNGWCAPYKSWQVIYDLDVRANLTATNAKRV34 LGSEVAMWSEIADEKAVEAKIWPRAAALAERLWTNPKTNWKSAMTRMRIQRDRIADA35 GVGTDAVHPLWCRQNPGKCTLV 36 >ACHHYP_01226 37 MTALADAVWLAVMAFLDGQDLSRLMRVSRAHWRRLQAQVRRWREIQLGLGLGHWV38 
QRNVRLTINTQVQEAQSLAVQRSPDARVPPRVETIQKELGPIEAERSVHRLTATTPLFT39 ATQQAVLVLSFDCTSADTKPLLVHTSQRARTLYTTLTLTIFDRTLRRHVYHKASGDLAT40 VPVAEKQAWTNAGATLRCDVASNDKSCQVQLGLPARLDGKIDCYHIERVDFTLHKREL41 YPVFSLPLEPSLPTCWIHLQFHDLARAQCLARVSAPCHALLEMAASRTDDTNHPARRT42 AVEQLEVATFRSTQPTSLPDISSLAKPGMISMVISGPERHQAFYHTAFGHSGATRKSDS43 AHVLAATWVPGVLEFAMYPDTLNRRVLKGIFTLEFAVSGALTSLVVLAQHLSPRRLLRY44 NARVASYSRRPEAERNEDA 45 >ACHHYP_01546 46
MVALFLGTAIALALASSATGSFTGLAMPAANSSEPKSGQCKLMKLLPRATQFNVALS1 PRHYGRGGHCGRCVQTQCDRCAASAPIIAQVTDRASDVGLSKPMLRALFGSGAPSAV2 TWDFVDCPVNDPIALCTKPRNTSAYIIYVQPTNTVAGVQNMTIDGFRGRLTNASYHFKA3 PMPANWSNVRVSMKSFTGDAIAASVALRPGRCVTIPHQFSPSPAAASGTPAVIDYDGD4 EDADSITVPPPYK 5 >ACHHYP_02169 6 MAWIVVLGILAHVATALQSSLCATSSAFSPPGCHANRRLATWSRAIVRLNAGGHVCT7 GWFVGSEGHILTAHHCIHKARAVEVVVEETPAQTCPPRTIRGRMTTGIDVVAFSVALDY8 ALLRPLNRSVRGPVHLQLHSSAADIVGLEAIVAQHVDASSPVVLSEAGRIVSTTFAGCG9 RRDRLAYALDTKASASGSPILSTATGAVLGLHTCGGIHCHGKSVPMWIVIGCSSEPGH10 WNSGAVAADVVADLRQRHHLPPDAVAHETLSAPTPSTIIVERGRLVQRAANTTSVDAY11 LLTMAMPGRVTLDLLAWTMDAQGRWHDLRRDCDGSFFDTKVILAVVDDADGRPLLRR12 IAENDNDTRHQGMGDGSIDNRDAFLDVYLASPGDYYVLVGTAAMLLPAVFAPRLSAPT13 DGGQHLYGCGNTRATEANYNLRITTDDGTLQRIEAPFPRTAACSSSARKCPAAHADTA14 LTLDAVVAGTLHRTYSSGTSMDHISFELTKAGRIAIDVVSYQEHTNGSIAIDGLHDVCGR15 AYLDTVLYVFGATIPSGEYLDPAALVATASDRPPTHVASQRYRSVSTRDPYVEVDLPA16 GNFTLVVGQQPLSLFEAVRVLYPGSRETDAPLLCGRPHPFGHYHVFFWVQHRRMLSA17 TMPGSFDHAACTHEVCSDSML 18 >ACHHYP_02305 19 MKFTTLLVATVFGQNTTTAPSSAPTPAPTKCLLQFTSPCKSSSECGDLNGFNLTCIKS20 GSNKQCNFNGGSTVAKDNQFKAADNLVYQFGDCSTASCTTGHGFTEGLPTTVTCQE21 PLVCVKEINDNPGVVLKSQCHTCGSCKAQSLKDTRFDCSKVCPLTPAPTTKAPKVPGA22 TGSAASSGSGSETSAPATRAPKTGTPAPTAASSASTALVSGIAVVALAFAQLC 23 >ACHHYP_03044 24 MAGLIVGILAAVGTFSGSGESISTGTSSTPAPTTHTPTTLSPSPTTKPTTVTPTPTLAN25 GLCPLRGMYLSGTSCVACPTPKKTFSVFWESQVDCSTFATSSAAAYVTHIYWSFALID26 PTTGTVSSTFQGSSATLKACIAAARAKCIKNYVSIGGATMRQTFVALNSSAQLTTFALS27 AAQVVQEYGFDGVDIDDESGNLLAGGDWKANALPNVLVYLQGLKTQLAALPRAATEP28 KYQITWDEFPTSLSTGCDLASGDYLRCFDVRIANIVDQVNIMMYNSASSTDYDNFLNVV29 TPTEWATAMPASKIVIGGCVGPIGTIGGCAFGAAPTATQLKAYASLLDPALHERLSRMD30 LGFMLDLARDELLVLLESEQAHNPGVAVREGEGREDKQQQRRVQREVGAEEVDEAH31 VGEERVEGGVRRDLAGVEQQ 32 >ACHHYP_03052 33 MAAVSNPLLPLQLALADLLERPIHAALDDALRQPSNEQHLHHCVRSLPPSATVDALD34 ASLAFVVHARALLTICSDYLDQHIAPQHALKKITDLLSVSREIANDAEVNATADDADVDE35 AATDDSDQFASPKGEPPVGPWSGSETPAAPTSRQSWWAQIWGGDEDNDSAGDDVS36 APPEEETLPSLPVEVANTIASLAAFPTNLKLQLHGLEALVEYVHGPCCCESVGPLYAAP37 
DMLPAVLHAISSLAQSKRAQIAGLSLLANPSSPKANMPMLPANLPTQQVRRLILRAMQR38 FKAHAQIQGLGCLALSNLCRGPAISESHALKARGCRLVWSSWLLALICASSGTSMRAH39 PLTGGPEDMQYAVLDAGSVAVVEAASRRFQDDDRVRKHADMALREMLQKHASRRAP40 QCAFQ 41 >ACHHYP_04549 42 MRARAFFVLAGCATAAASPPLPWQSSCQVCAHTGRCGGASSPIKFCGTWPTGACC43 CSANVNCPTPGVHATCDCGFLADYPVDAALPPVADVLGYNFS 44 >ACHHYP_04706 45
MRASVLAIAATVAAAANNQTATTKVFSLEVGTVGVHASRNQDSVLIPCKSNVCVPTG1 SATLEFCRKACNRETGEHDCTTNCACNGTTPGYMCAGICNKAKTADECGSPVFQTCS2 GEDLVPDYECANYKCTNHQRTNYLGANNRCANYERAAYPLPDIHARVRFVKLLNPGE3 AHLIEYYTGLYFGPGQNNANDGFIWNPSVGSIKSISGNSCLDAYVAVDHNVYVHTYPC4 DDSNPNQWWLYDSSLHQLRHKTHSTMCLDADPNDANKKVQMYLCSPGNANQYFDM5 RPILS 6 >ACHHYP_04908 7 MTSVVAVTACLLSWLQRSRASPPVAYSAPNSVAFPAEIVHIKVILSRRRSSVLANGVL8 PPVAPPRRAAHHEGHLAPLTRDLLSDKSGNAPP 9 >ACHHYP_05005 10 MSHCHFAFFVPMLARSLASFTRASRRCFSTEGPFEHRAVSAEVIAELKALYGDRVSTA11 ASVREHHGTDESYHTPSPPDVVVYADSTEEVSKILQIASASKTPVIPFGAGSSLEGHISA12 LHGGISLDLTNMKSVISVEQENMSCRVQCGVTRLQLESELRATGLFFPVDPGADATLG13 GMVATNASGTTTVRYGNMKSNVLGLTAVMADGKIIKTGSKARKSSAGYDLTRLFIGSE14 GTLAVVTEVELRLQGVPEAQKIAVCSFPTIQDAVDTCTVIMQMGIPVARMEFMDHKAIE15 ATNSYSKLNNIVSPCLVIEMNGTPEEIEHHTATVQALAEEYSVQRMSWAATEEDRKELL16 KARHSAWYATMNLVPGSRALSTDVCVPISNLTQVIVDTQADLEASNLVGTIVGHVGDG17 NFHVMLPFLPEDEPAVRAFSDRLVERALAADGTCTGEHGIGSGKIKYLRMEHGDSVDV18 MRTIKQALDPHNILNPSKLF 19 >ACHHYP_05180 20 MYNTADSVAFLSLLTSTVRAITPLPPLQFRVQAKFATGPLPASKPSPSSFISVRFVWNI21 LVRLVVYRRRATPTPVDMAQERTVLA 22 >ACHHYP_05326 23 MHCTFFLSIVTAALAGVAGHVQQRIRSGAVKARGVNLGSWLVTEHFMMPQSPIYQNV24 SADLQPLGEYVVTTALGRAVADPLFKAHRSSWITENDIKEIASFGLNTVRVPVGWWIYE25 DPNDSDWQAYSPGGIQYLDALINDWALKYNVAVLVGMHGAKGSQNGEGHSAPQLPG26 ESHFTDDADNVYTTMQSAKFIMSRYQSSVAFLGLEMLNEPTITPGRVYNIDRTKLIIYYT27 NLYSKLRAICSSCIIMLSPLLNEQYESFGNQWANVLPTGSNNWIDWHKYLIWGFENWS28 MKDIINTGTQWIANDITLWQSRRSAPIFVGEWSLAAAEGILGELKNGTNLNTYANRALA29 AMKEAKAGWTYWSWKVNATDWRSYGWNMQALLRAGVIDLKNA 30 >ACHHYP_05770 31 MSKLSLAFLLHPTALACPPGPEAYVCPLSPETIVCPLSPRVSPASSARAKPKRSPPA32 PRSRPCKEPGCTKYAVTRGHCIAHGGGKRCSVEQCPSGAKSNGLCWKHGGSKTCS33 FPKCSNRSKTYGVCWSHGGGKQCADPNCTKTALRHGFCWAHGGGKRCRTEGCQR34 PAYERNDNLCDVHCAKAS 35 >ACHHYP_06287 36 MQLSHILLFATAAAAQHTLLDSGTPEDRPSSWGSPVTKQIPSAVRFRSSGLCGEAQTI37 DYVDFMVNTDLADIKANATWIGVEICPSVEDVPACPPTSVAEQIPIEVRGKRTTLHWVP38 ATPKVLEPESLYWFIVSSNVENALQAVSWYPGSKRYGTDNDPKSDVASATRMLVPWG39 GMDWVVEPSGGVAPLDHRRVPNAKIVVKA 40 >ACHHYP_06505 41 
MIKSFTITATLLASASSLQMTNKERNELIDELNQWKKSQAGKTALVQGLLPPHPKTESF42 DANAKLEAELVRFATTKKVVEKLNAEHNGSAVFSTDNQFALMTDDEFKKYVQGAFGK43 PHKKRQLRGENIQLELTPAQREASGKDWTTSKCMPAVKNQGSCGSCWSFAAVGASA44 MAHCLVSGKLIDLSEQQLVSCASSAGQGCQGGWPNKALEYIAQTGVCTAADFPYTQS45 NGQCKQSCRKNKLSIGRPVDIRGESALQSALDKQPVTVVVEAGNNVWRNYKSGIVKS46
CPGAQSDHAVIAVGYGNGFFKIRNSWGANWGEQGYMRLQKGSGGNGMCNVAEAPS1 YPSMSGSPKPNNDDNNMPDDNDD 2 >ACHHYP_06977 3 MKISRVAVIGLLFVAARSTRAQSTSSSTQSNTETTSTESTPFSSSSSSGPAPIVDVIAAA4 IDAGATPKQAAIVAVAADTGASLAAIIQTAVDAGVSPSIASAVASAANSAAGSGADDVTS5 APITTVADAAVDAGATTAQAAAIANAASSGVSGDDLVNVAISVGVPASIASSVASAAGS6 AAGTPAPIADVIQAALDSGASLNQAAAIAVAVSAGVSVDDIQTAAIQQGLPASVASSIAS7 AVQSTIASAAGSTSADALAANGLGVTSASSTSYVPPSEVTPLKLTGAKDPEAASDVNS8 PEAYSFSAPMTSGSTKSSESPLSGISGMFNNIVALVTSAPSPAEEPKPRLRASCRTA 9 >ACHHYP_07400 10 MKTPAFLASALFAVATGERPACGPDTPSPTMTPTADPTFAPTSGPTFPPTPAPGQWT11 SLGGFAHDISFDGTNVCVKNGDGAFCGFAGQPFDQWKPVATQLKDIEQVACAKGVAF12 VWGRSSGDLVMKTINLKTGEEHDAKMQDGESPRQFSTDGSVVCGTTNSRLFGAKVT13 NGALGAYSTISEDHEIYKTAVAGEFLIVAGYDGALQATLLDAENWDTFSFDVVPVDLRA14 REISTDGVDLCIVTYELDIACSKLSSGLEKWTKVPGEWKTVAVSNNTIYGVDFKSSEIRY15 TYLK 16 >ACHHYP_08323 17 MVAWAWLPAAAAVVAATETHWSHLGNASSDRGLRIHTPITRADLHDEYNDAPVTQR18 RLSGSAASLFRAVAGYGFRGLSNAAIFSGVTLDMCASACVTDARCLSFDYEASTCYIA19 HTDRYAYPADFVPRATSTYYEWQGAAATPTIEPNGGRLTSYGAFQLFTTSRAAAMYY20 QFKSLENGTVTVYTLYSPGTTVTLPEYPCVVQAYTTKAGLSDSIVLVSNAFTVYAARYA21 YLVPFYNGLGFHGLVTRVQLDVQGVKRPRPSRVLEFTDINSTLGIGPFRGQLSTINLTA22 YDARLAGFFDAFTGITTTLCPQVESRVAVSTVTYVNVSLQVFQNASRWVLVPAPLYAS23 APGDLVFSSSVSLVEEYLYLCPHQNAKGHAGVIAKVNLRAFNATSHLPFQPAIEMLDLT24 VIDPSLTGFGSCFANRNYGYFVQRRNAAGLAGQIVRVNLDLFAQPALAVTVLNATTFD25 ARFVGFSGAVVYKNVAYLVPFERNKVGLELNPNYKYFPTPTSSIMGRLDLTTFSTVTPV26 DLSVLDVKYACGYFGGFTVSYYVYLVPNMWTTDTTSPGVNPYHGLVARLNTLTMNVE27 SLDLTLVDPSLKGFMRGFAFGRYAILVPHRNGLTTELPVRLNKSQKNNLGTIVAIDTDNF28 TPSGVRYLDLTLALRSQIPNMPDADLRGFIGGGVSGEYGFFVPYFNGVRFSGKVVRVN29 LRKFGEVQVLDMTQVHTSLRGFTNAVFPQLYEPTVTSLWNYVIPDGTQTPYTFITVDV 30 >ACHHYP_09221 31 MVSVTTPSMTLLGAIALVAGQATVAPTTATPSAPSASPTKGPWAFKSVRTVQARVQA32 DVPVWDAAHKEWVAVFPQNTVTFEQRYRAAMDTINTATVEGALFYVQTEGIDKAVQA33 ANGCMRKSNMSYIWYYDIEVVQPVYSVAEFGQNTGYAPEYGPFIAMDNGMCTPTSGT34 TVPQGCMQFTGLAGNIALGNYIGGEPRTKHQYANYANNYWFSYPNSCFTKSFTAKTD35 ACRNSPMQKGGLCPYGTKPDGINCTYSFSVLGYLSIDDLVGITSTVNPQTGKAFSNHM36 EFCKAGKYEWDFTTSTGLPFWADPLNVTANAARSAKMMDLYTAKVAAGVGEYANMK37 
PFPKVSELVAQNPSCSDNSPYCAKQPHGCQRSLLGQICVPCSSASPSCKPPTRAFPA38 LPVATTPPPVTDAAGNVVPMSTNLLGQAVPATSSASTVAFSATAAILVLALA 39 >ACHHYP_09519 40 MIVSAIVFAVLASAAGQSPLKIASSVPYALTIDGSAPVSTVISNTRATSLSVHIASMNLP41 PGATLTIGTVDGKDKVVYTGAHTNLVSDYFIQNKVVVSYAAASYSNNTTPLVAIDKYFA42 GTPDAGGLESICSTTGDLSRPAACYATSEPVKYAKARAIARLVIGGSSLCTGWLFGSE43 GHLLTNNHCINNDRLAASTQVEFGAECASCSDGSNNVQLACKGTIVASNVTLLATSSK44 LDFALVKINLNAGVDLSKYGYLQARDSAPVLNEPVWLAGHPQGDPLRMAVATSNNAE45 GAIVSTNVTDSCKDNQVGYLLDTQGGSSGSPVMSTVDNSVVAIHNCGGCDSETPSNG46
GIPLTKILAYLRANNIALPKNSVSAAPAPTTAKTTTASPATPAPSTSAPATAAPKPPTFTL1 CSVSNKVISEYYTGLYVAPAGHTANEQFSYSPDTGAIQVQSNGQCLDAYWGGSSFLV2 HTWPCDRGNNNQKWTVANNQVMHRVHGVCLTSVAGSKSLGVAPCNAADVRQWIYT3 NCDTANVRNFVQLRTPRGALVSEWYSSVLAKQPQSSWTELWEINGQQMRSFSGSTC4 LDAYWDNSRFQVHTWQCDPTNGNQQWRVGNSVVAHATHSNLCLDVDPTDPRQAAQ5 VWGCHSATINSNQLFDVVAF 6 >ACHHYP_10824 7 MASISQWLCLSCWAPMSTPKTTMATDAWCGTFWKHMLMSVSVTPPPACDCSTDGA8 TALFFAAQRGHSDIVYLLMSAGATAEESTLGISPKQIAQANGHTIVAAIFDTLPPPLPHRL9 HWERSSVLFLSSFLVYRCNLLLLRH 10 >ACHHYP_11025 11 MHARFFAPVLGTLSLVAGSATTLAVNSSRTPQVNAQVRRLSKRALPRDMGKSSTSA12 QAPEGSSKPDMMKDFPIFLFTIE 13 >ACHHYP_11286 14 MASESTPLLALLELPLLKPTSAETIQGHVTALRASFISGAMRPLAARKAQLRAIRALVE15 DGCEILQAAMWKDLHKHAAETFVTETSSVLLEVQDHLDNLDDWAAPHKVGTNLLNLP16 GSSYIRSDPLGVACIMDTWNYPIMLLLMPLIGAI 17 >ACHHYP_11397 18 MDRLLLLSALATAVAVDDAAPRPSRAPLPTTLVPWGSPLAAPTAPCTWGGRAHALD19 WNLTTSVPGSRQCFPNLFAADQPLEFPYPRSSYNYDLDPPVVGPRVQVQWTNGVTN20 VTAPVAAFDYRTFEMTGDELLFHALPDAPGVYRLAVQAFDWDRASSECRACLAVTDQ21 VRPRATVARAGLCGASTTAPYSPEALAAADDRVRALVRYRATATNNDACSDRRCDAV22 TVAQTGFLSAFPTAVVDGANAAVDAVPDGWLGCLAAPLSARERQRLTTPLALVDDAR23 DYFVALQELYTPFRCGAPPGRPTCAGAASETCALMQAVVLPASHLVARVAVKLKATAG24 HIADPAAAFPGAGYLPPSARHLHLAIPCYPTNASFSSFCADTVEWRVSDLFELSAELNA25 SQPWGFDAAAPLVTWFVQQGPAWVAVADNKRLAFDKFQDTLVFRAMTPCGQVGEDI26 AWTVFSHRAEALSVDAWWNSLWSCGGCNVPKADFSVCRFRFDPTSPLVSAMLHPPA27 SCRDAAGRSCRNGCLARGQCNGRSTAASCGQQAGATWCDARGSALLAAAVPRYSL28 RSLQCVWQYANTSSANWSVAVDVAVDTAFALKLRNADATELSVSCTLTFDPDTGEPA29 VVKTRSLALSLRNCDGPRFEDHALAFVKDRCDASWRPGVGRQPAPRQACAGHLVFP30 STTDAAATVLLTPADDLACCSGPVAAFSCQPLPGHPGLKQCQRADTATALLAAEPQA31 WPPVALAASLALVFVLVRRRRQPSDTDLSRPLIDGDRC 32 >ACHHYP_12628 33 MIVQILALAATASAFTKCHIRHPNRTEVLSTPCPHEYVTELPASFDWRNVNGTNFVTV34 SRNQHVPHYCGSCWAFAATSALSDRVRIARERNSEGKDRVLVTRQVNLSPQVLLNC35 DKEDMGCHGGEGLSAYRYIHENGIPEEGCQRYLATGHDVGNTCTAIDVCRNCEPSKG36 CFPQPSYDTYHVSEYGAVDGEAKMMAEIFARGPIVCGVAVTDEFLNYSGGVIDDKSGR37 TDIDHDISIVGWGVDGSGTKYWVGRNSWGTYWGEEGWFRLRRGNNNLGVETDCAF38 GVPADDGWPKRHTETTSPAKAAVWSGEIKSLLQPSRAQAKSRAPVHFVGGEKVLSPR39 
PHEEIDVLALPKQWDWRNIAGINYVTWDKNQHIPQYCGSCWAQATTSALSDRIAILRN40 ASWPEIALSPQVVVNCHGGGSCEGGNPGAVYEYAHRHGIPDQTCQAYVAKDGQCNA41 LGVCETCWPTNSSFTPGKCVAVPKFKSYYVAEYGHVRGADKMKAELYKRGPIGCGM42 HVTDKFEAYTGGIYSEKTWFPIPNHEISIAGWGFDEATQTEYWIGRNSWGTYWGENG43 WFRIKMHSDNLGIEGDCDWGVPIPDGSQPLL 44 >ACHHYP_13722 45
MKCFAVLAFAAFAAAATSEQAATTQPATTTAAVTTAAPNTTTVVPLVSTKAPNTTTVTP1 APTTKAVTTVPVTTVKANTTAPVTTVPVVTQTNVTSPDETETPEPVIEQPTDAPLPVPT2 KKKSNATTVPPSASASISMLSVASVAVAVAAYVM 3 >ACHHYP_14385 4 MATSVLALCFSSLTANSTNTPEPKYQTRTVDTVVYESSAKWPKYMGKGSAIQMYTTA5 ALSAQILVSFPETTTVLEKVATVGPLVSLSAVIFFGAKYLGERVITNVTSCRTVGQRGIT6 DAIYLYLDEFLKIQVAGGLRPKTFECYPKGVSALRLISYLKLVSKDENGMCNVKINRTTF7 WLDLGKAQVHQEQSLKILLDGKPLLVRKGKIKKAARA 8 >ACHHYP_15409 9 MGLFAPVLAFATVAVAGSSSTTLPTAPASLSTTRSVPLTDRAALIQELAKWKDSKAGK10 YAAANGFLKLSRLESAGDAEAELAAFAETKATVEALNQQYPLARFSTENPFALLTNDEF11 ATWVSGGRDKVQRKVPEASTTQSTTASIAPGTVDWTMSGCVASVRSQGVCGSCFAF12 AAVAAAESAYCLLHDRHLTPFSDQQVLSCGPGNGCMGGWSDQSLAWMASHGVCTG13 ASYPHTNDWNTTAAACIPECKALSMPYSSVASVAGEHELEAAIALQPVAVDISATSPVF14 KNYESGIITGGCNVDFNHVVLGVGYGVAEVPYFKMKNSWGDWWGEGGFVRLQRGV15 GGVGTCGLARHAAYPVVFPMPFNLVTFRGVVISEYYSNLFASAKQGSVNELWTYDAIT16 RHITVGSNHQCLDAYPTGSSYAVHTYSCDAKNDNQKWVIDSANHAIKHAVHPTLCLDV17 DPNQNNKVQVWSCSPGNQNQWVAVSEERVKLWNVNGNFLASDGNLIQFYSPSSPSY18 EWAVSNLDHTWRARSNVGAPDLCLDAYEPWNGGAVHLYTCDSTNGNQKWIYDAKTQ19 QLRHLTHVGFCLDMRTALGDKAHLWTCNTPANSLQKFQYKSLTFPA 20 21
LITERATURE CITED
Archibald, J. M. 2008. The origin and spread of eukaryotic photosynthesis: evolving views in light of genomics. Bot. Mar., 52:95--103.
Archibald, J. M. 2009. The puzzle of plastid evolution. Curr. Biol., 19:R81--R88.
Armbrust, E. V. 2009. The life of diatoms in the world’s oceans. Nature, 459:185--192.
Armbrust, E. V., Berges, J. A., Bowler, C., Green, B. R., Martinez, D., Putnam, N. H., Zhou, S., Allen, A. E., Apt, K. E., Bechner, M., Brzezinski, M. A., Chaal, B. K., Chiovitti, A., Davis, A. K., Demarest, M. S., Detter, J. C., Glavina, T., Goodstein, D., Hadi, M. Z., Hellsten, U., Hildebrand, M., Jenkins, B. D., Jurka, J., Kapitonov, V. V., Kröger, N., Lau, W. W. Y., Lane, T. W., Larimer, F. W., Lippmeier, J. C., Lucas, S., Medina, M., Montsant, A., Obornik, M., Parker, M. S., Palenik, B., Pazour, G. J., Richardson, P. M., Rynearson, T. A., Saito, M. A., Schwartz, D. C., Thamatrakoln, K., Valentin, K., Vardi, A., Wilkerson, F. P. & Rokhsar, D. S. 2004. The genome of the diatom Thalassiosira pseudonana: Ecology, evolution and metabolism. Science, 306:79--86.
Baginsky, S., Kleffmann, T., von Zychlinski, A. & Gruissem, W. 2005. Analysis of shotgun proteomics and RNA profiling data from Arabidopsis thaliana chloroplasts. J. Prot. Res., 4:637--640.
Barbrook, A. C., Howe, C. J. & Purton, S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci., 11:101--108.
Baurain, D., Brinkmann, H., Petersen, J., Rodríguez-Ezpeleta, N., Stechmann, A., Demoulin, V., Roger, A. J., Burger, G., Lang, B. F. & Philippe, H. 2010. Phylogenomic evidence for separate acquisition of plastids in cryptophytes, haptophytes and stramenopiles. Mol. Biol. Evol., 27:1698--1709.
Beakes, G. W. & Sekimoto, S. 2009. The evolutionary phylogeny of oomycetes - insights gained from studies of holocarpic parasites of algae and invertebrates. In: K. Lamour and S. Kamoun (ed.), Oomycete Genetics and Genomics: Diversity, Interactions, and Research Tools. John Wiley & Sons, Inc., Hoboken, NJ, USA. doi: 10.1002/9780470475898.ch1.
Birch, P. R. J., Rehmany, A. P., Pritchard, L., Kamoun, S. & Beynon, J. L. 2006. Trafficking arms: oomycete effectors enter host plant cells. Trends Microbiol., 14:8--11.
Bittner, L., Halary, S., Payri, C., Cruaud, C., de Reviers, B., Lopez, P. & Bapteste, E. 2010. Some considerations for analyzing biodiversity using integrative metagenomics and gene networks. Biol. Direct, 5:doi:10.1186/1745-6150-5-47.
Bodyl, A. & Moszczynski, K. 2006. Did the peridinin plastid evolve through tertiary endosymbiosis? A hypothesis. Eur. J. Phycol., 41:435--448.
Bodyl, A. 2005. Do plastid-related characters support the chromalveolate hypothesis? J. Phycol., 41:712--719.
Bodyl, A., Stiller, J. W. & Mackiewicz, P. 2009. Chromalveolate plastids: direct descent or multiple endosymbiosis. Trends Ecol. Evol., 3:119--121.
Bowler, C., Allen, A. E., Badger, J. H., Grimwood, J., Jabbari, K., Kuo, A., Maheswari, U., Martens, C., Maumus, F., Otillar, R. P., Rayko, E., Salamov, A., Vandepoele, K., Beszteri, B., Gruber, A., Heijde, M., Katinka, M., Mock, T., Valentin, K., Verret, F., Berges, J. A., Brownlee, C., Cadoret, J. P., Chiovitti, A., Choi, C. J., Coesel, S., De Martino, A., Detter, J. C., Durkin, C., Falciatore, A., Fournet, J., Haruta, M., Huysman, M. J., Jenkins, B. D., Jiroutova, K., Jorgensen, R. E., Joubert, Y., Kaplan, A., Kroger, N., Kroth, P. G., La Roche, J., Lindquist, E., Lommer, M., Martin-Jezequel, V., Lopez, P. J., Lucas, S., Mangogna, M., McGinnis, K., Medlin, L. K., Montsant, A., Oudot-Le Secq, M. P., Napoli, C., Obornik, M., Parker, M. S., Petit, J. L., Porcel, B. M., Poulsen, N., Robison, M., Rychlewski, L., Rynearson, T. A., Schmutz, J., Shapiro, H., Siaut, M., Stanley, M., Sussman, M. R., Taylor, A. R., Vardi, A., von Dassow, P., Vyverman, W., Willis, A., Wyrwicz, L. S., Rokhsar, D. S., Weissenbach, J., Armbrust, E. V., Green, B. R., Van de Peer, Y. & Grigoriev, I. V. 2008. The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature, 456:239--244.
Burki, F., Shalchian-Tabrizi, K., Minge, M., Skjaeveland, A., Nikolaev, S. I., Jakobsen, K. S. & Pawlowski, J. 2007. Phylogenomics reshuffles the eukaryotic supergroups. PLoS ONE, 2:e790.
Cavalier-Smith, T. 1999. Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J. Eukaryot. Microbiol., 46:347--366.
Cavalier-Smith, T. 2003. Genomic reduction and evolution of novel genetic membranes and protein-targeting machinery in eukaryote-eukaryote chimaeras (meta-algae). Philos. Trans. R. Soc. Lond. B. Biol., 359:109--134.
Chan, C. X., Reyes-Prieto, A. & Bhattacharya, D. 2011. Red and green algal origin of diatom membrane transporters: insights into environmental adaptation and cell evolution. PLoS ONE, 6(12):e29138. doi:10.1371/journal.pone.0029138.
Cock, J. M., Sterck, L., Rouze, P., Scornet, D., Allen, A. E., Amoutzias, G., Anthouard, V., Artiguenave, F., Aury, J. M., Badger, J. H., Beszteri, B., Billiau, K., Bonnet, E., Bothwell, J. H., Bowler, C., Boyen, C., Brownlee, C., Carrano, C. J., Charrier, B., Cho, G. Y., Coelho, S. M., Collen, J., Corre, E., Da Silva, C., Delage, L., Delaroque, N., Dittami, S. M., Doulbeau, S., Elias, M., Farnham, G., Gachon, C. M. M., Gschloessl, B., Heesch, S., Jabbari, K., Jubin, C., Kawai, H., Kimura, K., Kloareg, B., Küpper, F. C., Lang, D., Le Bail, A., Leblanc, C., Lerouge, P., Lohr, M., Lopez, P. J., Martens, C., Maumus, F., Michel, G., Miranda-Saavedra, D., Morales, J., Moreau, H., Motomura, T., Nagasato, Ch., Napoli, C. A., Nelson, D. R., Nyvall-Collén, P., Peters, A. F., Pommier, C., Potin, P., Poulain, J., Quesneville, H., Read, B., Rensing, S. A., Ritter, A., Rousvoal, S., Samanta, M., Samson, G., Schroeder, D. C., Ségurens, B., Strittmatter, M., Tonon, T., Tregear, J. W., Valentin, K., von Dassow, P., Yamagishi, T., Van de Peer, Y. & Wincker, P. 2010. The Ectocarpus genome and the independent evolution of multicellularity in brown algae. Nature, 465:617--621.
De Koning, A. P. & Keeling, P. J. 2004. Nucleus-encoded genes for plastid-targeted proteins in Helicosporidium: functional diversity of a cryptic plastid in a parasitic alga. Eukaryot. Cell, 3:1198--1205.
Delwiche, C. F. 1999. Tracing the thread of plastid diversity through the tapestry of life. Am. Nat., 154:S164--S177.
Dodge, J. D. 1975. A survey of chloroplast ultrastructure in the Dinophyceae. Phycologia, 14:253--263.
Dong, J., Chen, C. & Chen, Z. 2003. Expression profiles of the Arabidopsis WRKY gene superfamily during plant defense response. Plant Mol. Biol., 51:21--37.
Dorrell, R. G. & Smith, A. G. 2011. Do red and green make brown?: perspectives on plastid acquisitions within chromalveolates. Eukaryot. Cell, 10:856--868.
Drummond, A. J., Ashton, B., Buxton, S., Cheung, M., Cooper, A., Duran, C., Field, M., Heled, J., Kearse, M., Markowitz, S., Moir, R., Stones-Havas, S., Sturrock, S., Thierer, T. & Wilson, A. 2011. Geneious v5.5. www.geneious.com.
Elias, M. & Archibald, J. M. 2009. Sizing up the genomic footprint of endosymbiosis. BioEssays, 31:1273--1279.
Emanuelsson, O., Nielsen, H. & von Heijne, G. 1999. ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Prot. Sci., 8:978--984.
Foth, B. J. & McFadden, G. I. 2003. The apicoplast: a plastid in Plasmodium falciparum and other apicomplexan parasites. Int. Rev. Cytol., 224:57--110.
Gaulin, E., Madoui, M. A., Bottin, A., Jacquet, C., Mathé, C., Couloux, A., Wincker, P. & Dumas, B. 2008. Transcriptome of Aphanomyces euteiches: new oomycete putative pathogenicity factors and metabolic pathways. PLoS ONE, doi:10.1371/journal.pone.0001723.
Gibbs, S. 1981a. The chloroplast endoplasmic reticulum: structure, function, and evolutionary significance. Int. Rev. Cytol., 72:49--99.
Gibbs, S. 1981b. The chloroplast of some algal groups may have evolved from endosymbiotic eukaryotic algae. Ann. N.Y. Acad. Sci., 361:193--208.
Green, B. R. 2011. After the primary endosymbiosis: an update on the chromalveolate hypothesis and the origins of algae with Chl c. Photosynth. Res., 107:103--115.
Gruber, A., Vugrinec, S., Hempel, F., Gould, S. B., Maier, U. G. & Kroth, P. G. 2007. Protein targeting into complex diatom plastids: functional characterisation of a specific targeting motif. Plant Mol. Biol., 64:519--530.
Gschloessl, B., Guermeur, Y. & Cock, J. M. 2008. HECTAR: a method to predict subcellular targeting in heterokonts. BMC Bioinformatics, doi:10.1186/1471-2105-9-393.
Guillot, M. & Gibbs, S. 1980a. Evidence that the chloroplast and nucleomorph of cryptomonads are remnants of a eukaryotic symbiont. J. Cell Biol., 87:186.
Guillot, M. & Gibbs, S. 1980b. The cryptomonad nucleomorph: its ultrastructure and evolutionary significance. J. Phycol., 16:558--568.
Guindon, S., Dufayard, J. F., Lefort, V., Anisimova, M., Hordijk, W. & Gascuel, O. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol., 59:307--321.
Hackett, J. D., Yoon, H. S., Li, S., Reyes-Prieto, A., Rümmele, S. E. & Bhattacharya, D. 2007. Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of Rhizaria with chromalveolates. Mol. Biol. Evol., 24:1702--1713.
Hackett, J. D., Yoon, H. S., Soares, M. B., Bonaldo, M. F., Casavant, T. L., Sheetz, T. E., Nosenko, T. & Bhattacharya, D. 2004. Migration of the plastid genome to the nucleus in a peridinin dinoflagellate. Curr. Biol., 14:213--218.
Harper, J. T., Waanders, E. & Keeling, P. J. 2005. On the monophyly of chromalveolates using a six-protein phylogeny of eukaryotes. Int. J. Syst. Evol. Micr., 55:487--496.
Huang, J., Mullapudi, N., Lancto, C. A., Scott, M., Abrahamsen, M. S. & Kissinger, J. C. 2004. Genomic evidence supports past endosymbiosis, intracellular and horizontal gene transfer in Cryptosporidium parvum. Genome Biol., 11:R88.
Iida, K., Takishita, K., Ohshima, K. & Inagaki, Y. 2007. Assessing the monophyly of chlorophyll-c containing plastids by multi-gene phylogenies under the unlinked model conditions. Mol. Phylogenet. Evol., 45:227--238.
Janouskovec, J., Horak, A., Obornik, M., Lukes, J. & Keeling, P. J. 2010. A common red algal origin of the apicomplexan, dinoflagellate, and heterokont plastids. Proc. Natl. Acad. Sci. USA, 107:10949--10954.
Jiang, R. H., Tyler, B. M., Whisson, S. C., Hardham, A. R. & Govers, F. 2006. Ancient origin of elicitin gene clusters in Phytophthora genomes. Mol. Biol. Evol., 2:338--351.
Kamoun, S. 2006. A catalogue of the effector secretome of plant pathogenic oomycetes. Annu. Rev. Phytopathol., 44:41--60.
Keeling, P. J. 2004. Diversity and evolutionary history of plastids and their hosts. Am. J. Bot., 91:1481--1493.
Keeling, P. J. 2009. Role of horizontal gene transfer in the evolution of photosynthetic eukaryotes and their plastids. Methods Mol. Biol., 532:501--515.
Khan, H., Parks, N., Kozera, C., Curtis, B. A., Parsons, B. J., Bowman, S. & Archibald, J. M. 2007. Plastid genome sequence of the cryptophyte alga, Rhodomonas salina CCMP1319: lateral transfer of putative DNA replication machinery and a test of chromist plastid phylogeny. Mol. Biol. Evol., 24:1832--1842.
Kleffmann, T., Russenberger, D., von Zychlinski, A., Christopher, W., Sjolander, K., Gruissem, W. & Baginsky, S. 2004. The Arabidopsis thaliana chloroplast proteome reveals pathway abundance and novel protein functions. Curr. Biol., 14:354--362.
Kleffmann, T., Hirsch-Hoffmann, M., Gruissem, W. & Baginsky, S. 2006. plprot: a comprehensive proteome database for different plastid types. Plant Cell Physiol., 47:432--436.
Köhler, S., Delwiche, C. F., Denny, P. W., Tilney, L. G., Webster, P., Wilson, R. J., Palmer, J. D. & Roos, D. S. 1997. A plastid of probable green algal origin in apicomplexan parasites. Science, 275:1485--1489.
Kolaczkowski, B. & Thornton, J. W. 2008. A mixed branch length model of heterotachy improves phylogenetic accuracy. Mol. Biol. Evol., 25:1054--1066.
Kroth, P. G. 2002. Protein transport into secondary plastids and the evolution of primary and secondary plastids. Int. Rev. Cytol., 221:191--255.
Lane, C. E. & Archibald, J. M. 2008. The eukaryotic tree of life: endosymbiosis takes its TOL. Trends Ecol. Evol., 5:268--275.
Larkum, A. W. D., Lockhart, P. J. & Howe, C. J. 2007. Shopping for plastids. Trends Plant Sci., 12:189--195.
Lee, J. J., Leedale, G. F. & Bradbury, P. (eds) 2000. Illustrated Guide to the Protozoa. 2nd ed., Society of Protozoologists, Allen Press, Lawrence, Kansas.
Marchler-Bauer, A., Anderson, J. B., Derbyshire, M. K., DeWeese-Scott, C., Gonzales, N. R., Gwadz, M., Hao, L., He, S., Hurwitz, D. I., Jackson, J. D., Ke, Z., Krylov, D., Lanczycki, C. J., Liebert, C. A., Liu, C., Lu, F., Lu, S., Marchler, G. H., Mullokandov, M., Song, J. S., Thanki, N., Yamashita, R. A., Yin, J. J., Zhang, D. & Bryant, S. H. 2007. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res., 35:D237--D240.
Moustafa, A., Beszteri, B., Maier, U. G., Bowler, C., Valentin, K. & Bhattacharya, D. 2009. Genomic footprints of a cryptic plastid endosymbiosis in diatoms. Science, 324:1724--1726.
Okamoto, N., Chantangsi, C., Horák, A., Leander, B. S. & Keeling, P. J. 2009. Molecular phylogeny and description of the novel katablepharid Roombia truncata gen. et sp. nov., and establishment of the Hacrobia taxon nov. PLoS ONE, 4:e7080. doi:10.1371/journal.pone.0007080.
Pagel, M. & Meade, A. 2008. Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo. Phil. Trans. R. Soc. B., 363:3955--3964.
Parfrey, L. W., Grant, J., Tekle, I. Y., Lasek-Nesselquist, E., Morrison, H. G., Sogin, M. L., Patterson, D. J. & Katz, L. A. 2010. Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst. Biol., 59:518--533.
Patron, N. J., Inagaki, Y. & Keeling, P. J. 2007. Multiple gene phylogenies support the monophyly of cryptomonad and haptophyte host lineages. Curr. Biol., 17:887--891.
Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. 2011. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods, 8:785--786.
Philippe, H., Zhou, Y., Brinkmann, H., Rodrigue, N. & Delsuc, F. 2005. Heterotachy and long-branch attraction in phylogenetics. BMC Evol. Biol., 5:50. doi:10.1186/1471-2148-5-50.
Ralph, S. A., van Dooren, G. G., Waller, R. F., Crawford, M. J., Fraunholz, J. J., Foth, B. J., Tonkin, C. J., Roos, D. S. & McFadden, G. I. 2004. Metabolic maps and functions of the Plasmodium falciparum apicoplast. Nature Rev. Microbiol., 2:203--216.
Reyes-Prieto, A., Moustafa, A. & Bhattacharya, D. 2008. Multiple genes of apparent algal origin suggest ciliates may once have been photosynthetic. Curr. Biol., 13:956--962.
Rice, D. W. & Palmer, J. D. 2006. An exceptional gene transfer in plastids: gene replacement by a distant bacterial paralog and evidence that haptophyte and cryptophyte plastids are sisters. BMC Biol., 4:31.
Rogers, M. B., Patron, N. J. & Keeling, P. J. 2007. Horizontal transfer of a eukaryotic plastid-targeted protein gene to cyanobacteria. BMC Biol., 5:26.
Sanchez-Puerta, M. V. & Delwiche, C. F. 2008. A hypothesis for plastid evolution in chromalveolates. J. Phycol., 44:1097--1107.
Sanchez-Puerta, M. V., Lippmeier, J. C., Apt, K. E. & Delwiche, C. F. 2007. Plastid genes in a non-photosynthetic dinoflagellate. Protist, 158:105--117.
Sekimoto, S., Klochkova, T. A., West, J. A., Beakes, G. W. & Honda, D. 2009. Olpidiopsis bostrychiae sp. nov.: an endoparasitic oomycete that infects Bostrychia and other red algae (Rhodophyta). Phycologia, 48:460--472.
Shindo, T., Misas-Villamil, J. C., Hörger, A. C., Song, J. & van der Hoorn, R. A. L. 2012. A role in immunity for Arabidopsis cysteine protease RD21, the ortholog of the tomato immune protease C14. PLoS ONE, 7:e29317. doi:10.1371/journal.pone.0029317.
Slamovits, C. H. & Keeling, P. J. 2008. Plastid-derived genes in the nonphotosynthetic alveolate Oxyrrhis marina. Mol. Biol. Evol., 25:1297--1306.
Soll, J. & Schleiff, E. 2004. Protein import into chloroplasts. Nature Rev. Mol. Cell Biol., 5:198--208.
Stiller, J. W., Huang, J., Ding, Q., Tian, J. & Goodwillie, C. 2009. Are algal genes in nonphotosynthetic protists evidence of historical plastid endosymbiosis? BMC Genomics, doi:10.1186/1471-2164-10-484.
Tatusov, R. L., Natale, D. A., Fedorova, N. D., Jackson, J., Jacobs, A., Krylov, D. M., Mekhedov, S. L., Nikolskaya, A. N., Rao, B. S., Wolf, Y. I., Aravind, L., Lanczycki, C., Mazumder, R., Sreekumar, K., Vasudevan, S., Walker, D. R., Tatusova, T. A., Yao, K., Yin, J. & Koonin, E. V. 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics, 4:41.
Tyler, B. M., Tripathy, S., Zhang, X., Dehal, P., Jiang, R. H. Y., Aerts, A., Arredondo, F. P., Baxter, L., Bensasson, D., Beynon, J. L., Chapman, J., Damasceno, C. M. B., Dorrance, A. E., Dou, D., Dickerman, A. W., Dubchak, I. L., Garbelotto, M., Gijzen, M., Gordon, S. G., Govers, F., Grunwald, N. J., Huang, W., Ivors, K. L., Jones, R. W., Kamoun, S., Krampis, K., Lamour, K. H., Lee, M. K., McDonald, W. H., Medina, M., Meijer, H. J. G., Nordberg, E. K., Maclean, D. J., Ospina-Giraldo, M. D., Morris, P. F., Phuntumart, V., Putnam, N. H., Rash, S., Rose, J. K. C., Sakihama, Y., Salamov, A. A., Savidor, A., Scheuring, C. F., Smith, B. M., Sobral, B. W. S., Terry, A., Torto-Alalibo, T. A., Win, J., Xu, Z., Zhang, H., Grigoriev, I. V., Rokhsar, D. S., Boore, J. L. 2006. Phytophthora genome sequences uncover evolutionary origins and mechanisms of pathogenesis. Science, 313:1261--1266.
Whelan, S. & Goldman, N. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol., 18:691--699.
Wilson, R. J. M. 2004. Plastid functions in the Apicomplexa. Protist, 155:11--12.
Woehle, C., Dagan, T., Martin, W. F. & Gould, S. B. 2011. Red and problematic green phylogenetic signals among thousands of nuclear genes from the photosynthetic and apicomplexa-related Chromera velia. Genome Biol. Evol., 3:1220--1230.
Yoon, H. S., Hackett, J. D., Ciniglia, C., Pinto, G. & Bhattacharya, D. 2004. A molecular timeline for the origin of photosynthetic eukaryotes. Mol. Biol. Evol., 21:809--818.
Yoon, H. S., Hackett, J. D., Pinto, G. & Bhattacharya, D. 2002. The single, ancient origin of chromist plastids. Proc. Natl. Acad. Sci. USA, 99:15507--15512.