18
Genome Biology 2003, 4:R55 comment reviews reports deposited research refereed research interactions information Open Access 2003 Omelchenko et al. Volume 4, Issue 9, Article R55 Research Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ Marina V Omelchenko *† , Kira S Makarova , Yuri I Wolf , Igor B Rogozin and Eugene V Koonin Addresses: * Department of Pathology, FE Hebert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814-4799, USA. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA. Correspondence: Eugene V Koonin. E-mail: [email protected] © 2003 Omelchenko et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ The discovery of in situ gene displacement shows that combination of rampant horizontal gene transfer with selection for preservation of operon structure provides for events in prokaryotic evolution that, a priori, seem improbable. These findings also emphasize that not all aspects of operon evolution are selfish, with operon integrity maintained by purifying selection at the organism level Abstract Background: Shuffling and disruption of operons and horizontal gene transfer are major contributions to the new, dynamic view of prokaryotic evolution. Under the 'selfish operon' hypothesis, operons are viewed as mobile genetic entities that are constantly disseminated via horizontal gene transfer, although their retention could be favored by the advantage of coregulation of functionally linked genes. Here we apply comparative genomics and phylogenetic analysis to examine horizontal transfer of entire operons versus displacement of individual genes within operons by horizontally acquired orthologs and independent assembly of the same or similar operons from genes with different phylogenetic affinities. Results: Since a substantial number of operons have been identified experimentally in only a few model bacteria, evolutionarily conserved gene strings were analyzed as surrogates of operons. The phylogenetic affinities within these predicted operons were assessed first by sequence similarity analysis and then by phylogenetic analysis, including statistical tests of tree topology. Numerous cases of apparent horizontal transfer of entire operons were detected. However, it was shown that apparent horizontal transfer of individual genes or arrays of genes within operons is not uncommon either and results in xenologous gene displacement in situ, that is, displacement of an ancestral gene by a horizontally transferred ortholog from a taxonomically distant organism without change of the local gene organization. On rarer occasions, operons might have evolved via independent assembly, in part from horizontally acquired genes. Conclusions: The discovery of in situ gene displacement shows that combination of rampant horizontal gene transfer with selection for preservation of operon structure provides for events in prokaryotic evolution that, a priori, seem improbable. These findings also emphasize that not all aspects of operon evolution are selfish, with operon integrity maintained by purifying selection at the organism level. Published: 29 August 2003 Genome Biology 2003, 4:R55 Received: 22 April 2003 Revised: 26 June 2003 Accepted: 17 July 2003 The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2003/4/9/R55

Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

Embed Size (px)

DESCRIPTION

Abstract Background: Shuffling and disruption of operons and horizontal gene transfer are major contributions to the new, dynamic view of prokaryotic evolution. Under the 'selfish operon' hypothesis, operons are viewed as mobile genetic entities that are constantly disseminated via horizontal gene transfer, although their retention could be favored by the advantage of coregulation of functionally linked genes. Here we apply comparative genomics and phylogenetic analysis to examine horizontal transfer of entire operons versus displacement of individual genes within operons by horizontally acquired orthologs and independent assembly of the same or similar operons from genes with different phylogenetic affinities. Results: Since a substantial number of operons have been identified experimentally in only a few model bacteria, evolutionarily conserved gene strings were analyzed as surrogates of operons. The phylogenetic affinities within these predicted operons were assessed first by sequence similarity analysis and then by phylogenetic analysis, including statistical tests of tree topology. Numerous cases of apparent horizontal transfer of entire operons were detected. However, it was shown that apparent horizontal transfer of individual genes or arrays of genes within operons is not uncommon either and results in xenologous gene displacement in situ, that is, displacement of an ancestral gene by a horizontally transferred ortholog from a taxonomically distant organism without change of the local gene organization. On rarer occasions, operons might have evolved via independent assembly, in part from horizontally acquired genes. Conclusions: The discovery of in situ gene displacement shows that combination of rampant horizontal gene transfer with selection for preservation of operon structure provides for events in prokaryotic evolution that, a priori, seem improbable. These findings also emphasize that not all aspects of operon evolution are selfish, with operon integrity maintained by purifying selection at the organism level.

Citation preview

Page 1: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

com

ment

reviews

reports

deposited research

refereed researchinteractio

nsinfo

rmatio

n

Open Access2003Omelchenkoet al.Volume 4, Issue 9, Article R55ResearchEvolution of mosaic operons by horizontal gene transfer and gene displacement in situMarina V Omelchenko*†, Kira S Makarova†, Yuri I Wolf†, Igor B Rogozin† and Eugene V Koonin†

Addresses: *Department of Pathology, FE Hebert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814-4799, USA. †National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Correspondence: Eugene V Koonin. E-mail: [email protected]

© 2003 Omelchenko et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.Evolution of mosaic operons by horizontal gene transfer and gene displacement in situThe discovery of in situ gene displacement shows that combination of rampant horizontal gene transfer with selection for preservation of operon structure provides for events in prokaryotic evolution that, a priori, seem improbable. These findings also emphasize that not all aspects of operon evolution are selfish, with operon integrity maintained by purifying selection at the organism level

Abstract

Background: Shuffling and disruption of operons and horizontal gene transfer are majorcontributions to the new, dynamic view of prokaryotic evolution. Under the 'selfish operon'hypothesis, operons are viewed as mobile genetic entities that are constantly disseminated viahorizontal gene transfer, although their retention could be favored by the advantage of coregulationof functionally linked genes. Here we apply comparative genomics and phylogenetic analysis toexamine horizontal transfer of entire operons versus displacement of individual genes withinoperons by horizontally acquired orthologs and independent assembly of the same or similaroperons from genes with different phylogenetic affinities.

Results: Since a substantial number of operons have been identified experimentally in only a fewmodel bacteria, evolutionarily conserved gene strings were analyzed as surrogates of operons. Thephylogenetic affinities within these predicted operons were assessed first by sequence similarityanalysis and then by phylogenetic analysis, including statistical tests of tree topology. Numerouscases of apparent horizontal transfer of entire operons were detected. However, it was shown thatapparent horizontal transfer of individual genes or arrays of genes within operons is not uncommoneither and results in xenologous gene displacement in situ, that is, displacement of an ancestral geneby a horizontally transferred ortholog from a taxonomically distant organism without change of thelocal gene organization. On rarer occasions, operons might have evolved via independent assembly,in part from horizontally acquired genes.

Conclusions: The discovery of in situ gene displacement shows that combination of rampanthorizontal gene transfer with selection for preservation of operon structure provides for events inprokaryotic evolution that, a priori, seem improbable. These findings also emphasize that not allaspects of operon evolution are selfish, with operon integrity maintained by purifying selection atthe organism level.

Published: 29 August 2003

Genome Biology 2003, 4:R55

Received: 22 April 2003Revised: 26 June 2003Accepted: 17 July 2003

The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2003/4/9/R55

Genome Biology 2003, 4:R55

Page 2: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.2 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

BackgroundOperons, clusters of co-transcribed genes that often encodefunctionally linked proteins, are the principal form of geneorganization and regulation in prokaryotes [1,2]. Compara-tive analysis of bacterial and archaeal genomes has shownthat only a few operons are conserved across large evolution-ary distances. In general, gene order in prokaryotes is poorlyconserved and prone to numerous rearrangements [3-6]. Adetailed analysis of gene order conservation has shown thatonly 5-25% of the genes in bacterial and archaeal genomesbelongs to gene strings (probable operons) shared by at leasttwo distantly related species [7]. The presence of identical orsimilarly organized operons and suboperons in phylogeneti-cally distant bacterial or archaeal lineages may result fromthree distinct evolutionary processes. Firstly, inheritancefrom the respective common ancestor - the core of the ribos-omal protein superoperon is a case in point, but such conser-vation of operon organization is relatively rare; secondly,independent origin of identical operons or suboperons in dif-ferent lineages; and thirdly, emergence of operons in a singlelineage with subsequent dissemination by horizontal trans-fer. The potential central role of horizontal transfer in theevolution of operon organization of prokaryotic genomes isembodied in the 'selfish operon model' (SOM) [8-10]. Thismodel posits that "the physical proximity of genes in anoperon provides no selective benefit to the individual organ-ism but does enhance the fitness of the gene cluster itself, asclusters can be efficiently inherited horizontally as well as ver-tically" [11]. Under SOM, operons are conceptually analogousto integrating viruses (phages), transposons and other mobilegenetic elements, although coregulation of the genes in anoperon could be an important selective factor that favorsretention of operons during evolution.

Horizontal gene transfer (HGT) events have been classifiedinto distinct categories of acquisition of new genes, acquisi-tion of paralogs of existing genes and xenologous gene dis-placement whereby a gene is displaced by a horizontallytransferred ortholog from another lineage (xenolog [12]).Each of these types of horizontal transfer is common amongprokaryotes, but their relative contributions differ in differentlineages [13]. Comparative-genomic analyses by many groupshave suggested that, on the whole, horizontal gene transferhad substantial effects, albeit uneven in different lineages, onthe gene content of bacterial and archaeal genomes [13-19].However, in spite of the considerable popularity of the selfishoperon theory, we are unaware of systematic studies of hori-zontal gene transfer events at the level of operons. In part,this is likely to have been caused by the scarcity of experimen-tal data on operon organization in any prokaryote other thanEscherichia coli.

Recent phylogenetic analyses of ribosomal proteins revealedseveral instances of apparent xenologous gene displacementwithin a conserved operon, in which other genes have notbeen horizontally transferred; in other words, these operons

appear to represent an evolutionary mosaic [20-22]. Anotherstudy demonstrated a complicated mosaic organization of theleukotoxin operon in bacteria of the genus Mannheimia (Pas-teurella); the observed evolutionary pattern had to beexplained through multiple gene transfer events, which led tothe hypothesis that, in this case, frequent gene displacementconferred selective advantage onto the bacterium by main-taining antigenic variation [23]. In earlier studies, evolutionof operons from gene blocks with distinct evolutionary fateshas been considered for rfb operons coding for lipopolysac-charide biosynthesis in enterobacteria [24].

To assess the role of horizontal gene transfer in the evolutionof operons systematically, we undertook phylogenetic analy-sis of members of highly conserved gene neighborhoods thatare predicted to constitute operons [25]. We focused prima-rily on mosaic operons in which one or more of the genesapparently have been transferred from distantly related spe-cies such that the phylogeny of the transferred genes is obvi-ously incongruent with the phylogeny of the remaining genesin the respective operons.

Results and discussionIdentification of horizontal gene transferExperimental data on operons in organisms other than E. coliand, to a lesser extent, B. subtilis are scarce. Therefore weused conserved gene pairs and connected gene neighbor-hoods associated with them as an approximation of operonorganization of genes in other prokaryotic genomes. Severalstudies have suggested strongly that all gene pairs that areconserved in multiple genomes belong to the same operon[7,25,26]. Here we used an extremely conservative threshold(conservation of a gene pair in 10 genomes) to ensure thatonly genuine operons were analyzed. BLASTP searches forpotential horizontal gene transfer identified 729 candidategenes (9% of all genes comprising conserved neighborhoodsin 41 analyzed genomes), that is, genes whose encoded pro-tein sequences were more similar to homologs from phyloge-netically distant taxa than to those from the reference taxon(it might be worth noting that, throughout this analysis, wetreated genes as atomic units and did not consider the rela-tively unlikely possibility of HGT for portions of genes). Phy-logenetic analysis of these genes and their neighbors revealeddifferent types of evolutionary events, some of which involvewhole operons, whereas others seem to reflect operonmosaicity.

Probable horizontal transfer of whole operons or large por-tions of operons, when phylogenetic trees for all genes in apredicted operon had the same topology (which, however,was incompatible with the species tree) was identified in 35neighborhoods - approximately one third of all analyzedneighborhoods. These events were classified into three cate-gories: acquisition of a new (for the given lineage) operon,paralogous operon acquisition and xenologous operon

Genome Biology 2003, 4:R55

Page 3: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.3

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

displacement [13]. Examples of all these classes of apparentoperon transfer events are given in Table 1. These 35 neigh-borhoods generally represented functional classes of genesknown to be prone to HGT: transporters, generalmetabolism-related genes and signal transduction systems[13,15,17]. This seems to be a relatively low level of horizontal

transfer in view of the purported selfish behavior of operons[9,10]. However, the strict threshold, described above, on thedetection of conserved gene pairs undoubtedly led to manyhorizontally transferred operons being missed. Thus, thepresent analysis gives a conservative low bound of operontransfer.

Table 1

Examples of horizontally transferred operons

Operon Recipient organism and correspondent genes

Probable source Other probable recipients Comment

Operon acquisition

Pyruvate:ferredoxin oxidoreductase

Thermotoga maritima TM0015-TM0018

Archaea Aae, Hpy, Bha/Sau Apparently, the related operon for 2-oxoisovalerate oxidoreductase (TM1758-TM1759) was also transferred from archaea

Sulfate/molybdate transport Bacillus halodurans BH3128-BH3130

Gram-negative bacteria - No other such operons in Bacillus-Clostridium group members

Putative effector of murein hydrolase

Pyrococcus horikoshii PH1801-PH1802

Bacteria Pab, Mac

Allophanate hydrolase subunits

Pyrococcus horikoshii PH0987-PH0988

Bacteria Pab

Paralogous operon acquisition

Dipeptide transporter Vibrio cholerae VC0620-VC0616

Thermotoga/Archaea Tma It has several another bacterial operons including VC1091-VC1095

Ribonucleotide reductase alpha and beta subunit

Halobacterium sp. VNG2384G VNG2383G

Bacteria - Additional to "archaeal:" Ribonucleotide reductase alpha subunit VNG1644G, beta subunit is apparently lost

Aromatic amino-acid biosynthesis

Halobacterium sp. VNG0384G VNG0386G

Bacteria - Paralogs of this pair are VNG1646G-VNG1647G

Xenologous operon displacement

Histidine biosynthesis suboperon

Pseudomonas aeruginosa PA3151-PA3152

Epsilon-Proteobacteria -

Panthothenate synthesis Campylobacter jejuni Cj0297c-Cj0298c

Gram-positive bacteria -

DNA repair SbcDC Vibrio cholerae VCA0520-VCA0521

Gram-positive bacteria -

DNA gyrase A and B Halobacterium sp. VNG0887G-VNG0889G

Bacteria Hbs, Tac, Tvo, Afu,

Dipeptide transporter Streptococcus pyogenes SPy2000-SPy2004

Gamma-Proteobacteria -

Glutamate synthase complex

Thermotoga maritima TM0394-TM0398

Archaea - There is another homolog for gene TM0397 of possible archaeal origin

NADH:ubiquinone oxidoreductase

Halobacterium sp. VNG0635G-VNG0637G

Bacteria -

Phosphate transporter Methanothermobacter thermoautotrophicum MTH1727-MTH1734

Bacteria -

Genome Biology 2003, 4:R55

Page 4: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.4 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

In addition, 19 predicted operons with different phylogeneticaffinities of the constituent genes, that is, apparent mosaicoperons, were identified (Table 2). Again, this is definitely alow bound - not only because of the high threshold set for theidentification of conserved gene pairs, but also because thisnumber includes only cases that were clearly resolved by phy-logenetic tree analysis. In addition, we detected many uncer-tain cases where the different phylogenetic affinities of geneswithin an operon were not strongly supported (data notshown); at least some of these are probably also mosaicoperons.

Below we describe in greater detail several case studies ofputative mosaic operons; in each of these cases, in addition tothe basic set of 41 species, we included in the analysis theapparent orthologs of the respective proteins from allprokaryotic species in which they were detected, in order tocontrol for possible effects of taxon sampling. We found that,although the details of tree topology inevitably depended onthe set of species analyzed, the conclusions regarding HGTwere not affected by the inclusion of additional species.

Case studies of mosaic operonsRibosomal protein L29 geneIn the previous study that prompted this work, we analyzedthe phylogeny of several ribosomal proteins and found sev-eral cases of apparent horizontal transfer resulting in mosaicoperon organization [20]. Horizontal transfer "in the heart ofthe ribosome" also has been independently described by oth-ers [21,22]. Here we report another case of a ribosomal pro-tein operon with apparent in situ gene displacement (that is,displacement without change of the local gene arrangement)via HGT. Figure 1a shows the highly conserved gene arrange-ment around the gene for the large subunit protein L29. Thephylogenetic trees for the flanking L16 and S17 genes showedlargely congruent topologies without any indications of HGT(Figure 1b,d). In contrast in the L29 tree, unexpected cluster-ing is seen for Aquifex aeolicus and both Rickettsia: theAquifex branch is within the archaeal cluster, whereas theRickettsia group is with Chlamydia, rather than with the restof alpha-proteobacteria: the taxon where Rickettsia belong(Figure 1c). In situ displacement is the most likely mechanismbehind this observation given that the structure of this operonis conserved in the majority of bacteria. The nature of theselective advantages conferred by this gene substitution isunclear, but the apparent sources of the transferred genessuggest that the displacements indeed might be adaptive.Aquifex apparently acquired the L29 gene from archaea,which could be related to the adaptation to the hyperthermalconditions, whereas Rickettsia probably captured the genefrom other parasitic bacteria, such as Chlamydia. However,these observations also allow a non-adaptationist interpreta-tion, under which the apparent source of acquired genes sim-ply reflects the increased likelihood of gene exchange betweenthe respective organisms due to co-habitation, with chancefixation of some of the transferred genes.

The ruvB gene of MycoplasmaThe genes for Holliday junction resolvase subunits RuvA andRuvB form an operon that is conserved in most of thesequenced bacterial genomes (Figure 2a). In the phylogenetictrees for RuvA and RuvB, the branch that includesUreaplasma and Mycoplasma occupies drastically differentpositions. In contrast to RuvA, which belongs to the Gram-positive clade as expected (Figure 2b), mycoplasmal RuvBclusters with the epsilon-proteobacteria (Helicobacter andCampylobacter) and the mycoplasma-epsilon-proteobacte-ria clade further joins alpha-proteobacteria (Figure 2c). Thisclustering is strongly supported by bootstrap analysis andwas shown to be robust using statistical tests of tree topology(Table 3). Thus, the ruvB gene seems to have undergonexenologous displacement in situ after the divergence of themycoplasmal branch from the rest of Gram-positive bacteria.Notably, the gene exchange seems to have occurred betweenphylogenetically distant parasitic bacteria.

Undecaprenyl pyrophosphate synthase gene in the lipid biosynthesis operon of RickettsiaIn Rickettsia, the undecaprenyl pyrophosphate synthase gene(uppS), which belongs to a highly conserved doublet of lipidbiosynthesis genes embedded in functionally diverse operons(Figure 3a), clusters with an unexpected assemblage of bacte-rial orthologs, including those from the spirocheteTreponema pallidum and Fusobacterium nucleatum, but notwith the 'native' taxon, alpha-proteobacteria (Figure 3b,c).Statistical testing of the tree topology showed that clusteringof rickettsial uppS with those from other alpha-proteobacte-ria is highly unlikely (Table 3). The apparent in situ gene dis-placement of the uppS gene in Rickettsia was accompanied bya breakdown of the operon into three fragments (Figure 3a).The topology of the uppS tree suggests the possibility of mul-tiple HGT events, although only the rickettsial genomes showevidence of gene displacement in situ. The emergence of genedisplacement in bacterial parasites is noted here again.

NADH:ubiquinone oxidoreductase subunits in Halobacterium sp.Gene organization in the NADH:ubiquinone oxidoreductaseoperon is highly conserved in all sequenced archaeal genomesand those of several groups of bacteria (Figure 4a). The nuoIgene of Halobacterium sp. shows an unexpected phylogeneticaffinity with proteobacteria (Figure 4c), whereas the neigh-boring genes have the regular archaeal affinities (Figure4b,d). The unusual phylogeny of halobacterial NuoI, whichwas strongly supported by statistical tests (Table 3), suggestsin situ displacement by a proteobacterial gene. Notably, allthree NADH:ubiquinone oxidoreductase subunits of thecyanobacteria unexpectedly grouped within the archaealclusters of the respective trees (Figure 4b-d). These observa-tions point to a complex history of HGT for the genes encod-ing all subunits of NADH:ubiquinone oxidoreductase.

Genome Biology 2003, 4:R55

Page 5: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.5

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

Table 2

Examples of probable mosaic operons

Species Predicted operon General operon function

Horizontally acquired genes

Probable source of horizontally acquired genes

Functions of horizontally acquired genes

Cluster 1*

Rickettsia prowazekii Rickettsia conorii

RP633-661, RC0980-1008

Ribosomal operon RP651 RC0998 Chlamydia L29

Aquifex aeolicus Aq001-021 Ribosomal operon Aq018a Archaea L29

Cluster 2

Rickettsia prowazekii Rickettsia conorii

RP800-804, RC1234-1238

F0F1-type ATPase RP804 RC1238 Gram-positive bacteria

Delta subunit

Ureaplasma urealyticum UU128-138 F0F1-type ATPase UU128, UU132_1, UU133, UU134

Gram-negative bacteria

Epsilon subunit, alpha subunit, delta subunit, delta subunit

Mycobacterium leprae ML1139-1146 F0F1-type ATPase ML1139 Gram-negative bacteria

A chain protein

Cluster 3

Rickettsia prowazekii Rickettsia conorii

RP134-139, RC175-180

Ribosomal proteins, transcription antiterminator, SecE

RP134 RC175 Gram-positive bacteria

Preprotein translocase subunit SecE

Cluster 5

Aquifex aeolicus Aq1968_1_2 two domains

Histidine biosynthesis Gram-negative bacteria

Phosphoribosyl-AMP cyclohydrolase

Cluster 8

Methanococcus jannaschii

MJ1037-1038 Tryptophan biosynthesis

MJ1037 Bacteria Tryptophan synthase beta chain

Methanobacterium thermoautotrophicum

MTH1655-1661 Tryptophan biosynthesis

MTH1660 Gram-negative bacteria

Tryptophan synthase alpha chain

Halobacterium sp. VNG0305-0309 Tryptophan biosynthesis

VNG0307G Bacteria Tryptophan synthase beta chain

Bacillus subtilis Bacillus halodurans

PabB-folK BH0090-0095

Tryptophan biosynthesis

PabB, BH0090 Gram-negative bacteria

Anthranilate/para-aminobenzoate synthases component I

Cluster 9

Halobacterium sp. VNG0635G-0647G NADH:ubiquinone oxidoreductase

VNG0640G Gram-negative bacteria

NADH dehydrogenase-like protein

Cluster 18

Rickettsia prowazekii Rickettsia conorii

RP423-425, RC0588-0590

Lipid metabolism RP425, RC0590 Spirochetes Undecaprenyl pyrophosphate synthase

Cluster 27

Halobacterium sp. VNG1306G-1310G Succinate dehydrogenase/fumarate reductase

VNG1310G Actinobacteria Succinate dehydrogenase subunit C

Genome Biology 2003, 4:R55

Page 6: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.6 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

Cluster 29

Mycoplasma genitalium Mycoplasma pneumoniae

MG461-466 MPN677-682

Housekeeping MG466 MPN682 Gram-negative bacteria

Ribosomal protein L34

Cluster 34

Thermotoga maritima TM0548-0556 Leucine/isoleucine biosynthesis

TM0552 TM0555 TM0554

2-Isopropylmalate synthase 3-Isopropylmalate dehydratase, small subunit 3-Isopropylmalate dehydratase, large subunit

Pyrococcus abyssi PAB888-895 PAB0890 PAB0893 Bacteria 2-Isopropylmalate synthase (LeuA-1) 3-Isopropylmalate dehydrogenase (LeuB)

Clostridium acetobutylicum

CAC3169-3174 Leucine/isoleucine biosynthesis

CAC3172 CAC3173 CAC3174 Archaea

3-Isopropylmalate dehydratase, small subunit 3-Isopropylmalate dehydratase, large subunit 2-Isopropylmalate synthase

Cluster 41

Thermotoga maritima TM1243-1251 Nucleotide metabolism

TM1243 Archaea Phosphoribosylaminoimidazole-succinocarboxamide synthase

Cluster 42

Lactococcus lactis L0104-0108 Arginine biosynthesis L0107 Gram-negative bacteria

Acetylglutamate kinase

Thermotoga maritima TM1780-1785 Arginine biosynthesis TM1784

Archaea Acetylglutamate kinase

Cluster 48

Borrelia burgdorferi BB0054-0061 Carbohydrate metabolism (glycolysis, gluconeogenesis)

BB0057 Gram-positive bacteria

Glyceraldehyde-3-phosphate dehydrogenase

Cluster 54

Thermotoga maritima TM1780-1785 Arginine biosynthesis TM1780 Gram-negative bacteria

Argininosuccinate synthase

Cluster 63

Mycoplasma pneumoniae Mycoplasma genitalium

MPN573-574 MG391-392

Molecular chaperones MPN574 MG393 Gram-negative bacteria

Heat shock protein (groES)

Cluster 82

Mycoplasma pneumoniae, Mycoplasma genitalium

MPN535-536 MG358-359

DNA replication, recombination and repair

MPN536 MG359 Gram-negative bacteria

Holliday junction resolvasome helicase subunit

Table 2 (Continued)

Examples of probable mosaic operons

Genome Biology 2003, 4:R55

Page 7: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.7

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

Lipopolysaccharide biosynthesis operon in Methanothermobacter thermoautotrophicus and Deinococcus radioduransThe genes of the lipopolysaccharide biosynthesis (rfbABCD)operon appear to have been extensively and independentlyshuffled in many prokaryotic genomes and might have under-gone multiple horizontal transfers. This conclusion is sup-ported both by examination of the operon organization(Figure 5a) and by phylogenetic tree analysis (Figure 5b-e).The trees showed a clear affinity between the rfbA, rfbB, rfbCgenes of Methanothermobacter thermoautotrophicum andClostridium acetobutylicum (Figure 5b-d), with Fusobacte-rium nucleatum and Listeria monocytogenes joining thecluster in the case of rfbB (Figure 5b), whereas M. thermoau-totrophicum RfbD clustered with its archaeal orthologs asexpected (Figure 5e). The genes of the rfbABCD operon inMethanothermobacter are shuffled compared to the proba-ble ancestral order, which is found in many bacteria and C.acetobutylicum also shows a rearrangement (Figure 5a). Onelikely scenario in this case is that M. thermoautotrophicumacquired the rfbABCD operon with the typical gene orderfrom a bacterium of the clostridial lineage, which was fol-lowed by displacement of three resident genes and loss of oneof the invading genes, accompanied by operon

rearrangement. An alternative scenario is that the rearrange-ment occurred in the source bacterium of the clostridialgroup and Methanothermobacter acquired only the rfbACBportion, which might have inserted head-to-tail downstreamof the original operon, followed by elimination of the residentrfbABC (Figure 5a).

Another interesting case of mosaic structure of the sameoperon is seen in Deinococcus radiodurans (Figure 5a). Dei-nococcus RfbA shows clear affinity with proteobacteria (Fig-ure 5d), whereas RfbD is of archaeal descent (Figure 5e), withRELL analysis revealing no competing topologies (Table 3).The remaining two genes of this operon in Deinococcus, rfbB(DRA0041) and rfbC (DRA0043), have uncertain phyloge-netic affinities (Figure 5b,5c). Thus, as in the case of M. ther-moautotrophicus, this operon in Deinococcus was apparentlyformed through at least two events of xenologous gene dis-placement in situ and gene shuffling.

Leucine/isoleucine biosynthesis operonPerhaps the most prominent case of mosaic operon organiza-tion is the leucine/isoleucine biosynthesis operon of severalbacteria and archaea, particularly Thermotoga maritima.

Ureaplasma urealyticum UU448-449 DNA replication, recombination and repair

UU448 Gram-negative bacteria

Holliday junction resolvasome helicase subunit

Cluster 86

Halobacterium sp. VNG6305CC-6306C Tetrahydrobiopterin biosynthesis

VNG6305C Gram-negative bacteria

Organic radical activating enzyme

Cluster 87

Halobacterium sp. VNG0582C-0586C Energy production and conversion

VNG0582, VNG0583G

Bacteria Cytochrome b subunit of the bc complex Cytochrome b subunit of the bc complex

Cluster 103

Archaeoglobus fulgidus AF0321-0325 Lipopolysaccharide biosynthesis

AF0323b Bacteria dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes

Deinococcus radiodurans DRA0037-DRA0044 Lipopolysaccharide biosynthesis

DRA0044 Archaea dTDP-4-dehydrorhamnose epimerase

Methanothermobacter thermoautotrophicus

MTH1789-1792 Lipopolysaccharide biosynthesis

MTH1789, MTH1790, MTH1791

Gram-positive bacteria Bacteria Bacteria

dTDP-D-glucose 4,6-dehydratase dTDP-4-dehydrorhamnose 3,5-epimerase dTDP-glucose pyrophosphorylase

*The numbering of gene clusters is from the previously published analysis of gene neighborhoods in prokaryotic genomes [25].

Table 2 (Continued)

Examples of probable mosaic operons

Genome Biology 2003, 4:R55

Page 8: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.8 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

Figure 1 (see legend on next page)

10

MTH1119 MthAF1339 Afu

MJ0543 MjaSSO0294 Sso

APE0449 ApePAB1444 Pab

PH0633 PhoVNG0099G Hsp

TVN0539 TvoTa1057m Tac

UU238 UurMPN172 Mpn

MG158 MgeCgl0503 Cgl

SCO4709 ScoRv0708 Mtu

ML1856 Mleaq 018 Aae

SA2040 SauOB0126 Oih

BS rplP BsuBH0141 Bha

lin2774 LinL0412 Lla

SP0216 SpnSPy0057 Spy

TM1493 TmaCPn0640 CpnCT521 Ctr

DR0318 DraFN1638 Fnu

CAC3126 CacCT2182 Cte

sll1805 Sspall4208 Nsp

tlr0088 TelBB0485 Bbu

TP0196 TpaHP1312 HpyCj1700c Cje

SMc01302 Smemlr0301 Mlo

BMEI0764 BmeCC1255 Ccr

RP652 RprRC0999 Rco

VC2589 VchHI0784 Hin

PM1408 PmuBU517 Bsp

rplP EcoYPO0217 Ype

NMB0149 NmeRSc3012 Rso

NE0408 NeuXF1159 Xfa

PA4256 Pae

RC0999 RC0998 RC0997

Aq018 Aq018a Aq020

TM1493 TM1492 TM1491

msr0301 mcr0303 mcr0304

CC1255 CC1256 CC1257

Tma

Aae

Rco

Ccr

Mlo

COG0197

L16

COG0255

L29

COG0186S17

RP652 RP651 RP650Rpr

SP0216 SP0217 SP0218Spn

rplP rpmC rpsQBsu

COG0197

10

FN1637 FnuSSO6397 SspMTH9 Mth

MJ0462 MjaTVN0331 Tvo

Ta1264a TacVNG1698G Hsp

AF1918 Afuaq 018a Aae

APE0362a ApeCT2181 Cte

PAB8082 PabPHs048 Pho

BB0486 BbuTP0197 Tpa

SCO4710 ScoCgl0504 Cgl

ML1855 MleRv0709 Mtu

BH0142 BhaOB0127 Oih

BS rpmC BsuSA2039 Sau

lin2773 LinTM1492 Tma

CAC3125 CacUU239 Uur

MPN173 MpnMG159 MgeCj1699c Cje

HP1311 HpyDR0319 Dra

tpl0089 Telssl3436 Ssp

asl4207 NspCT520 Ctr

CPn0639 CpnRP651 RprRC0998 Rco

SP0217 SpnL0423 Lla

SPy0059 SpyCC1256 Ccr

msr0303 MloSMc01301 Sme

BMEI0765 BmeNE0409 Neu

XF1160 XfaVC2588 Vch

YPO0218 YperpmC Eco

BU516 BspHI0785 HinPM1407 Pmu

PA4255 PaeNMB0150 Nme

RSc3011 Rso

1

2

3

4

COG0255 COG0186

10

CT519 CtrCPn0638Cpn

ssl3437 Sspasl4206 Nsp

tlr0090 TelTa1262 Tac

TVN0334 TvoVNG1700G Hsp

MTH12 MthAF1916 Afu

SSO0709 SspAPE0360 Ape

MJ0465 MjaPH1770 PhoPAB2127 Pab

aq 020 AaeTM1491 Tma

NE0410 NeuRSc3010 Rso

DR0320 DraHP1310 Hpy

Cj1698c CjeCC1257 Ccr

BMEI0766 Bmemsr0304 Mlo

SMc01300 SmeRP650 RprRC0997 Rco

NMB0151 NmeXF1161 Xfa

PA4254 PaeCT2180 Cte

VC2587 VchYPO0219 Ype

HI0786 HinPM1406 Pmu

rpsQ EcoBU515 Bsp

TP0198 TpaUU240 Uur

MPN174 MpnMG160 Mge

CAC3124 CacFN1636 Fnu

BB0487 BbuCgl0505 Cgl

SCO4711 ScoML1854 Mle

Rv0710 MtuSPy0060 SpySP0218 Spn

L0394 LlaBH0143 Bha

lin2772 LinBS rpsQ Bsu

SA2038 SauOB0128 Oih

(a) (b)

(c) (d)

Genome Biology 2003, 4:R55

Page 9: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.9

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

This is the only known branched chain amino acid biosynthe-sis operon, and it is partly conserved in a wide range of bacte-ria (Figure 6a). Following initial indications from the analysisof taxon-specific BLAST hits, we constructed phylogenetictrees for each of the genes of this operon. Unlike other bacte-ria, Thermotoga has two leuA paralogs, which are adjacent inthe operon. The proteins encoded by these paralogous genesshow clearly distinct phylogenetic affinities: TM0552 belongsto a distinct clade within the archaeal domain, whereasTM0553 is part of a Gram-positive bacterial cluster (Figure6b). This phylogenetic mosaic in Thermotoga extends fur-ther, with LeuB (TM0556) clustering with proteobacterialorthologs (Figure 6c), and LeuC (TM0554) and LeuD(TM0555) with archaeal orthologs (Figure 6d,e). All theseaffinities were strongly supported by two versions of boot-strap analysis (Table 3). The genes encoding LeuA, LeuC, andLeuD from Thermotoga, Clostridium, Aquifex and bothPyrococcus abyssi and P. furiosus belong to a well-definedclade, which also includes a medley of alpha-proteobacteriaand cyanobacteria, within the archaeal domain in the respec-tive trees (Figure 6b-e). Thus, this sub-operon apparently hasbeen relatively recently horizontally spread among theseorganisms. Pyrococcus abyssi and P. furiosus probablyacquired these genes after the divergence from the commonancestor with P. horikoshii because the latter has only the typ-ical archaeal operon (Figure 6a).

Given the apparent propensity of Thermotoga (and otherhyperthermophilic bacteria) for acquisition of archaeal genesvia HGT, it seems most likely that the archaeal version of theleuACD suboperon originally entered the bacterial domainvia Thermotoga or a related thermophilic bacterium.Formally, in Thermotoga these events could be classified as acombination of paralogous (sub)operon acquisition(TM0554-TM0555 in addition to another paralogousarchaeal gene pair TM0291-TM0292) and xenologous genedisplacements (genes TM0553, TM0556). In Clostridium,xenologous operon displacement seems to have occurredbecause the ancestral operon of the Gram-positive typeapparently had been lost. The subsequent evolution of thisoperon in the four organisms proceeded along differentpaths. Aquifex has lost the operon structure even for the twosubunits of 3-isopropylmalate dehydratase (LeuB, LeuD).Different genes in the operons of P. abyssi and C.

acetobutylicum have been translocated and several genesprobably have been independently accrued (Figure 6a). Inboth P. abyssi and Thermotoga, the original leuA and leuBgenes within the leuABDC core seem to have been independ-ently displaced by bacterial orthologs without a clear affinity

ent phylo ((continued from previous page)Figure 1Genes with different phylogenetic affinities in a ribosomal operon from Aquifex aeolicus and Rickettsia prowazekii. (a) A fragment of ribosomal operon in Aquifex aeolicus (the operon from Thermotoga maritima is shown for comparison), Rickettsia prowazekii and Rickettsia conorrii (operons from other alpha-proteobacteria are shown for comparison). Genes are shown not to scale; the direction of transcription is indicated by arrows and gene numbers/names are given inside each arrow. Orthologous genes are shown by the same color. White arrows show genes in each genome that are unique in this operonic context. Phylogenetic affinity of a gene is shown as a thick colored border on the respective arrow; black denotes belonging to the reference taxon, red denotes not belonging to reference taxon. COG0197 - ribosomal protein L16/L10E; COG0255 - ribosomal protein L29; COG0186 - ribosomal protein S17. For species abbreviations, see Materials and methods. (b) Unrooted maximum-likelihood tree for ribosomal protein L16. Branches supported by bootstrap probability >70% are marked by black circles. Names of the genes from mosaic operons and the respective branches are shown in red. Branches for which the likelihoods of alternative placements were assessed using the RELL method are indicated by circles with numbers (see Table 3). (c) Unrooted maximum-likelihood tree for ribosomal protein L29;. the designations are as in Figure 1b. (d) Unrooted maximum-likelihood tree for ribosomal protein S17; the designations are as in Figure 1b.

Table 3

Kishino-Hasegawa test for the analyzed cases of apparent xenologous gene displacement in situ

Tree* Diff lnL† S.E.‡ RELL-BP§

L19 original 0.0 ML 0.8004

1R2 -12.6 7.7 0.0480

3R4 -6.6 6.6 0.1516

RuvB original 0.00 ML 0.9631

1R2 -27.1 15.4 0.0369

UppS original 0.00 ML 0.9883

1R2 -29.3 12.8 0.0117

NuoH original 0.00 ML 0.8336

1R2 -7.4 7.9 0.1664

RfbA original 0.00 ML 1.0000

1R2 -151.1 25.0 0.0000

RfbD original 0.00 ML 0.9005

1R2 -17.0 13.3 0.0995

LeuA original 0.00 ML 1.0000

1R2 -150.2 25.8 0.0000

3R4 -418.6 31.5 0.0000

5R6 -245.0 27.8 0.0000

LeuB original 0.00 ML 0.9847

1R2 -52.9 18.1 0.0007

3R4 -31.7 14.9 0.0146

LeuC original 0.00 ML 1.0000

1R2 -302.7 31.6 0.0000

3R4 -439.1 32.1 0.0000

LeuD original 0.00 ML 1.0000

1R2 -66.6 17.2 0.0000

3R4 -76.7 16.8 0.0000

*The numbers refer to local rearrangements of the tree as indicated on the corresponding figures. †Difference of the Log-likelihoods relative to the best tree. ‡Standard error of Diff lnL. §Bootstrap probability of the given tree calculated using the RELL method (Resampling of Estimated Log-likelihoods).

Genome Biology 2003, 4:R55

Page 10: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.10 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

In situ displacement of the ruvB gene in MycoplasmaFigure 2In situ displacement of the ruvB gene in Mycoplasma. (a) Organization of the Holliday junction resolvasome operon and surrounding genes in bacteria. COG0632 - Holliday junction resolvasome, DNA-binding subunit, COG2255 - Holliday junction resolvasome, DNA-binding subunit, COG0817 - Holliday junction resolvasome, endonuclease subunit, COG0392 - Predicted integral membrane protein, COG0282 - acetate kinase, COG0839 - NADH:ubiquinone oxidoreductase subunit 6 (chain J), COG0244 - ribosomal protein L10, COG0732 - restriction endonuclease S subunits, COG0809 - S-adenosylmethionine:tRNA-ribosyltransferase-isomerase, COG0772 - bacterial cell division membrane protein, COG0624 - acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase and related deacylases, COG1487 - predicted nucleic acid-binding protein, COG1132 - ABC-type multidrug transport system, ATPase and permease components, COG0442 - prolyl-tRNA synthetase, COG0323 - DNA mismatch repair enzyme, COG1408 - predicted phosphohydrolases. The designations are as in Figure 1a. For species abbreviations, see Materials and methods. (b,c) Unrooted maximum-likelihood tree for RuvA (b) and RuvB (c); the designations are as in Figure 1b.

COG0632

10

NE0212 NeuNMB0265 Nme

RSc0501 RsoXF1904 Xfa

PA0966 PaeCBU1568 Bsp

VC1846 VchruvA Eco

PM0977 PmuHI0313 Hin

TM0165 TmaTP0543 Tpa

BB0023 BbuFN1104 Fnu

CAC2285 CacSA1468 Sau

lin1568 LinBS ruvA Bsu

BH1224 BhaL0266 Lla

SPy2119 SpySP0179 Spn

MYPU 6570 MpuUU449 Uur

MPN535 MpnMG358 Mge

CPn0620 CpnCT501 Ctr

Cj0799c CjeHP0883 Hpy

CT0262 CteDR1274 Dra

Cgl1621 CglRv2593c MtuML0482 Mle

RP385 RprRC0531 Rco

CC3237 Ccrmll3898 Mlo

BMEI0333 BmeAGl2223 Atu

SMc03966 SmeTTE1179 Tte

all0375 Nspsll0876 Ssp

MPN534 MPN536

MG357 MG359

UU450 UU448

COG0632ruvA

COG2255ruvB

L0276 L0267

RP384 RP386

CC3238 CC3236

MPN535

MG358

UU449

L0266

RP385

CC3237

MPN537

MG360

UU447

L70850

RP387

CC3235

COG0839COG0282

mll3899 mll3895mll3898 mll3894

bofG ruvBruvA queA

COG0809

MPN538

MG361

COG0244

UU446UU451

MP533

COG0732COG0392 COG0772 COG0624

mll3900mll3901

COG0817 COG1487

COG1408COG0323

UU452

COG0442COG1132

10

CC3236 Ccre

mll3895 Mlo

BMEI0334 Bme

AGl2225 AtuSMc03965 Sme

RP386 Rpr

RC0533 Rco

MYPU 6580 Mpu

UU448 UurMPN536 Mpn

MG359 Mge

Cj1362 Cje

HP1059 HpyNE0213 Neu

NMB1243 Nme

RSc0500 Rso

XF1902 XfaCBU1570 Oih

PA0967 Pae

VC1845 Vch

ruvB EcoPM0976 Pmu

HI0312 Hin

DR0596 Dra

BB0022 BbuTP0162 Tpa

CT1630 Cte

CT040Ctr

CPn0390 Cpn

Cgl1620 CglML0483 Mle

Rv2592c Mtu

TM1730 Tma

TTE1180 TteCAC2284 Cac

FN1217 Fnu

SA1467 Sau

BS ruvBm BsuBH1225 Bha

lin1567 Lin

L0267 Lla

SP0259 SpnSPy0038 Spy

all2894 Nsp

sll0613 Ssp

1

2

COG2255

Lla

Rpr

Ccr

Mpn

Mge

Uur

Bsu

Mlo

(a)

(b) (c)

Genome Biology 2003, 4:R55

Page 11: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.11

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

Genes with different phylogenetic affinities in the lipid biosynthesis operon of RickettsiaFigure 3Genes with different phylogenetic affinities in the lipid biosynthesis operon of Rickettsia. (a) Organization of the lipid biosynthesis operon and surrounding genes in Rickettsia prowazekii and Rickettsia conorrii (operons from three other alpha-proteobacteria are shown for comparison). COG0020 - undecaprenyl pyrophosphate synthase, UppS; COG0575 - CDP-diglyceride synthetase; COG0750 - predicted membrane-associated Zn-dependent proteases; COG0233 - ribosome recycling factor; COG0528 - uridylate kinase; COG0745 - OmpR-like response regulator; COG0642 - signal transduction histidine kinase; COG0729 - outer membrane protein; COG2919 - septum formation initiator; COG0743 - 1-deoxy-D-xylulose 5-phosphate reductoisomerase. The designations are as in Figure 1a. For species abbreviations, see Materials and methods. (b,c) Unrooted maximum-likelihood tree for UppS (b) and CdsA (c); the designations are as in Figure 1b.

COG0020

10

aq 1248 AaeBB0120 Bbu

APE1385 ApeCj0824 Cje

TM1398 TmaCAC1791 Cac

TTE1404 TteL183602 Lla

SPy1965 SpySP0261 Spn

SA1103 Saulin1352 Lin

BS yluA BsuBH2423 Bha

OB1590 OihCgl2231 Cgl

Rv2361c MtuML0634 Mle

CC1919 Ccrmll0640 Mlo

BMEI0827 BmeAGc2550 Atu

SMc02097 SmeCT0267 Cte

XF1050 XfaVC2256 Vch

BU236 BspyaeS Eco

YPO1049 YpeHI0920 HinPM1989 Pmu

PA3652 PaeNMB0186 Nme

RSc1408 RsoNE1714 Neu

all3432 Nspsll0506 Ssp

tlr1763 TelHP1221 Hpy

CT450 CtrCPn0566 Cpn

FN1326 FnuTP0603 Tpa

RP425 RprRC0590 Rco

MK0767 MkaPH1590 Pho

PAB0394 PabAF1219 Afu

VNG1779C HspMA3723 Mac

MTH232 MthMJ1372 Mja

DR2447 DraCgl0966 Cgl

Rv1086 MtuML2467 Mle

SSO0163 SsoTa0546 Tac

TVN0600 Tvo

1

2

CC1921 CC1920 CC1919 CC1918 CC1917 CC1916

RP425 RP424 RP423

mll0643 mll0642 mll0640 mll0639 mll0638 mll0637

AGc2544 AGc2546 AGc2550 AGc2551 AGc2553

Rpr

Ccr

Mlo

Atu

COG0575

CdsA

Rco RC0590 RC0589 RC0588RC0591RC0592RC0593

RP426RP427

Eco

Cac

Spy

yaeS cdsA yaeTyaeL

CAC1791 CAC1792 CAC1794CAC1793

Spy1965 Spy1964 Spy1963

COG0020

UppS

COG0750 COG0729COG0642COG0745 COG0233 COG0528 COG2919 COG0743

COG0575

10

FN1325 Fnuaq 1249 AaeTM1397 Tma

CAC1792 CacTTE1403 Tte

L182799 LlaSPy1964 Spy

SP0262 SpnSA1104 Sau

lin1353 LinOB1591 Oih

BS cdsA BsuBH2422 Bha

mll0639 MloBMEI0828 Bme

AGc2551 AtuSMc02096 Sme

RP424 RprRC0589 Rco

DR1509 DraCT0233

slr1369 Sspall3875 Nsp

tlr2108 TelCC1918 Ccr

CT451 CtrCPn0567 Cpn

PA3651 PaeVC2255 Vch

cdsA EcoYPO1050 Ype

HI0919 HinPM1990 Pmu

XF1049 XfaNMB0185 Nme

RSc1409 RsoNE1713 Neu

BB0119 BbuHP0215 Hpy

Cj1347c CjeTP0602 Tpa

Cgl1975 GglRv2881c Mtu

ML1589 MleSSO0776 Sso

MA3306 MacMK1073 Mka

APE0433 ApeAF1740 Afu

VNG2119C HspMJ1600 Mja

MTH832 MthPH0343 Pho

PAB1285 PabTa0107 TacTVN0186 Tvo

(a)

(b) (c)

Genome Biology 2003, 4:R55

Page 12: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.12 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

Figure 4 (see legend on next page)

VNG0635G VNG0636G VNG0637G_1 VNG0639G VNG0640G VNG0641C VNG0642C VNG0643G VNG0646G VNG0647G

SSO0322 SSO0665 SSO0323 SSO0325 SSO0326 SSO0327 SSO0328_1 SSO0328_2 SSO0329

AF1828 AF1829_1 AF1829_2 AF1831 AF1832a AF1823 AF1826 AF1827

Ta0970m Ta0969m Ta0968 Ta0966 Ta0965 Ta0964 Ta0963 Ta0962 Ta0961 Ta0960 Ta0959

nuoA nuoB nuoC_1 nuoH nuoI nuoJ nuoK nuoL nuoM nuoN

mll1372 mll1371 mll1369 mll1361 mll1359 mll1358 mll1357 mll1355 mll1354 mll1352

Hsp

Ssa

Afu

Tac

Eco

Mlo

VNG0637G_2

Ta0967m

AF1830

SSO0324

nuoC_2

mll1367

nuoE nuoF nuoG

mll1366 mll1365 mll1364 mll1362

COG1005

10

VNG0639G Hsp

tlr0667Tel

sll0519 Ssp

alr0223 Nsp

PAE1581 Pae

AF1831 Afu

SSO0325 Sso

APE1421 Ape

MA1499 Mac

TVN1112 Tvo

Ta0966 Tac

CT0770 Cte

Cj1572c Cje

HP1267 Hpy

aq 1315 Aae

PA2643 Pae

BU160 Bsp

nuoH Eco

YPO2549 Ype

DR1498 Dra

Rv3152 Mtu

RP796 Rpr

RC1230 Rco

CC1945 Ccr

mll1361 Mlo

BMEI1151 Bme

AGc2354 Atu

SMc01921 Sme

NE1770 Neu

NMB0250 Nme

XF0312 Xfa

RSc2055 Rso

COG1143

10

CT0771 Cte

aq 1317 Aae

aq 1375 Aae

VNG0640G HspRP795 Rpr

RC1229 Rco

CC1942 Ccr

AGc2355 Atu

SMc01922 Sme

mll1359 Mlo

BMEI1150 Bme

XF0313 Xfa

NMB0251 Nme

RSc2054 Rso

NE1769 Neu

Cj1571c Cje

HP1268 Hpy

PA2644 Pae

BU161 Bsp

nuoI Eco

YPO2548 Ype

DR1497 Dra

Rv3153 Mtu

TVN1111 Tvo

Ta0965 Tac

TM1217 1 Tma

PH1446 Pho

PAB0496 Pab

MA1500 Mac

MTH1240 Mth

MJ1302 Mja

SSO0326 Sso

APE1419 Ape

AF1832a Afu

PAE1580 Pae

tlr0668 Tel

sll0520 Ssp

alr0224 Nsp

1

2

COG0839

10

AF1823 Afu

VNG0641C 06442c Hsp

SSO0327 Sso

APE1418 Ape

TVN1110 1109 Tvo

Ta0964 09663 Tac

CT0772 Cte

MA1501 1502 Mac

tlr0669 Tel

sll052 Ssp1

alr0225 Nsp

DR1496 Dra

Rv3154

aq 1318 Aae

aq 1377 Aae

PA2645 Pae

YPO2547 Ype

BU162 Bsp

nuoJ Eco

Cj1570c Cje

HP1269 Hpy

XF0314 Xfa

NMB0253 Nme

RSc2053 Rso

NE1768 Neu

RP790 Rpr

RC1224 Rco

CC1941 Ccr

BMEI1149 Bme

mll1358 Mlo

AGc2357 Atu

SMc01923 Sme

COG1005 COG1143 COG0839COG0852 COG0713COG0377COG0838 COG0649 COG1009 COG1008 COG1007COG1905 COG1894 COG1034(a)

(b) (c)

(d)

VNG0648G

Genome Biology 2003, 4:R55

Page 13: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.13

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

with any specific bacterial lineage (Figure 6a). The most likelyscenario for evolution of this operon in Thermotoga is that itoriginated as a Gram-positive type operon and subsequentlymany genes (or sub-operons) have been displaced in situthrough multiple horizontal transfers and a few additionalgenes have been inserted into the preexisting structure. Thealternative but less likely hypothesis involves independent, denovo operon assembly from genes of different phylogeneticaffinities. Several other apparent HGT events were detectedduring the analysis of the phylogenetic trees for leucine bio-synthesis genes (DR1614 in LeuD tree, DR1610 in LeuC tree(Figure 6d,e)) but, in these cases, the acquired genes do notbelong to conserved operons.

ConclusionsIntragenomic plasticity and inter-species horizontal mobilityof operons are thought to be important facets of prokaryoticgenome evolution. Indeed, the results presented here indicatethat horizontal transfer of entire operons is the most likelyexplanation for most of the findings of co-localized 'alien'genes in a genome, which is generally consistent with SOM.However, a substantial fraction - approximately 35% - ofoperons with indications of horizontal transfer events appearto consist of genes with different phylogenetic affinities. Bar-ring artifacts of phylogenetic analysis, which can never beruled out completely, but appear unlikely given the strongstatistical support for the anomalous placement of the genesin question in phylogenetic trees, two evolutionary scenariosfor the origin of such mosaic operons are conceivable. Thefirst involves de novo assembly of operons, in part from genesacquired via HGT, whereas the second one postulates in situxenologous displacement of genes within a resident operon.Analysis of mosaic operons suggested that both scenariosmight apply, but in situ displacement is likely to be more fre-quent. In several cases, in situ displacement seems to haveoccurred between genomes of distantly related parasitic bac-teria that might have shared a host. A sequence of events thatis often considered as an alternative to HGT is an ancientduplication with subsequent differential loss of paralogs.However, in the cases analyzed here, this seems to be a partic-ularly remote possibility because a tandem duplication fol-lowed by lengthy evolution of both paralogs within the operonwould be required to mimic in situ displacement. Tandem

pairs of paralogs are uncommon in operons and such a'smoking gun' was not observed in any of the suspected casesof in situ displacement.

At first glance, in situ gene displacement seems highlyunlikely: given the vast evolutionary distance separating thedonor and recipient genomes, homologous recombination isout of the question. In cases when the displacing gene(s) islocated on the periphery of an operon (for example, Figure5a), a plausible mechanism could involve initial insertion ofthe invading gene in the vicinity of the resident operon, fol-lowed by deletion of intervening genes (provided these arenon-essential). However, when the displacing gene is tuckedbetween resident ones (for example, Figures 4a, 6a),displacement must have occurred with surgical precision. Theonly conceivable explanation seems to be that HGT isextremely common in the evolution of prokaryotes and so isintragenomic recombination, which provides for rare chanceoccurrences of in situ displacement. Conceivably, a horizon-tally acquired gene that displaces the resident ortholog with-out disruption of operon organization would have its chancesof evolutionary fixation greatly increased, hence the apparentdisproportional survival of the displacing genes. Thisexplanation does not refute SOM as the conceptual frame-work explaining the origin of operons but emphasizes the'altruistic' aspect of the evolution of operons whereby theoperon integrity is maintained by strong purifying selectionat the organism level.

Materials and methodsSequence dataAmino acid sequences from 41 completely sequencedprokaryotic genomes were extracted from the Genome divi-sion of the Entrez retrieval system [27] and used as the masterspecies set for this analysis. Bacterial species abbreviations:Aquifex aeolicus (Aae), Bacillus halodurans (Bha), Bacillussubtilis (Bsu), Streptococcus pyogenes (Spy), Staphylococcusaureus (Sau), Clostridium acetobutylicum (Cac), Borreliaburgdorferi (Bbu), Campylobacter jejunii (Cje), Chlamydiatrachomatis (Ctr), Chlamydophila pneumoniae (Cpn), Dei-nococcus radiodurans (Dra), Escherichia coli (Eco), Haemo-philus influenzae (Hin), Helicobacter pylori (Hpy),Lactococcus lactis (Lla), Mesorhizobium loti (Mlo),

In situ gen((continued from previous page) Figure 4In situ gene displacement in the NADH-ubiquinone oxidoreductase operon in Halobacterium. (a) Organization of the NADH-ubiquinone oxidoreductase operon in selected archaeal and bacterial genomes. COG0838 - NADH:ubiquinone oxidoreductase subunit 3 (chain A), COG3077 - DNA-damage-inducible protein J, COG0852 - NADH:ubiquinone oxidoreductase 27 kD subunit, COG0649 - NADH:ubiquinone oxidoreductase 49 kD subunit 7, COG1905 - NADH:ubiquinone oxidoreductase 24 kD subunit, COG1894 - NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit, COG1034 - NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G), COG1005 - NADH:ubiquinone oxidoreductase subunit 1 (chain H), COG1143 - Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I), COG0839 - NADH:ubiquinone oxidoreductase subunit 6 (chain J), COG0713 - NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K), COG1009 - NADH:ubiquinone oxidoreductase subunit 5 (chain L), COG1008 - NADH:ubiquinone oxidoreductase subunit 4 (chain M), COG1007 - NADH:ubiquinone oxidoreductase subunit 2 (chain N). The designations are as in Figure 1a. For species abbreviations, see Materials and methods. (b-d) Unrooted maximum-likelihood tree for NuoH (b), NuoI (c) and NuoJ (d); the designations are as in Figure 1b.

Genome Biology 2003, 4:R55

Page 14: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.14 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

Figure 5 (see legend on next page)

NMB0756

AF0321AF0325 AF0324 AF0323b AF0323a

MTH1792 MTH1791 MTH1790 MTH1789

PAB0783 PAB0784 PAB0785 PAB7298 PAB7296 PAB0788

PH0414 PH0415 PH0416 PH0417

PAB0789PAB0787

AF0322

APE1181 APE1180 APE1179 APE1178

SSO0833 SSO0832 SSO0831 SSO0830

CAC2318 CAC2317 CAC2316 CAC2315 CAC2333 CAC2332 CAC2331

spsA spsB spsC spsDA spsE spsF spsG spsK_1

BH3366BH3363 BH3364 BH3365

spsI spsJ spsK_2

Spy0933 Spy0935 Spy0936Spy0784

DRA0040DRA0037 DRA0038 DRA0039 DRA0044DRA0041 DRA0042 DRA0043

rfbX glf wbbH wbbI wbbJ wbbK yefJrfbC

PA5164PA5161 PA5162 PA5163

rfbB rfbD rfbA

XF0258

NMB0082NMB0079 NMB0080 NMB0081

XF0255 XF0256 XF0257

trs

TVN0899 TVN0900 TVN0901 TVN0902

mlr7553mlr7550 mlr7551 mlr7552

SMb21327SMb21324 SMb21325 SMb21326

AG1532AG1529 AG1530 AG1531

CC3633CC1140 CC1141 CC3629

Rv3465Rv0334 Rv3266Rv3464

ML1965ML2503 ML0751ML1964

Rv3265 Rv3264

ML0752 ML0753

CAC2321 CAC2320 CAC2319

Mth

Cac

Bsu

Pab

Afu

Pho

Ape

Tvo

Sso

Bha

Spy

Dra

Eco

Pae

Xfa

Nme

Mlo

Sme

Atu

Ccr

Mtu

Mle

COG1091

RfbD

COG1209

RfbA

COG1898

RfbC

COG1088

RfbB

COG1088

10

SPy0936 Spy

BL0229 Blo

NCgl0327 Cgl

Rv3464 Mtu

ML1964 Mle

TVN0902 Tvo

PH0414 Pho

PAB0785 Pab

AF0324 Afu

SSO0830 Sso

SSO1781 Sso

APE1180 Ape

BS spsJ Bsu

BH3364 Bha

SCO0749 Sco

MA3779 Mac

DRA0041 Dra

rfbB Eco

STM2097 Sty

NMA0204 Nme

SO3167 Son

AGl530 Atu

CC3629 Ccr

SMb21326 Sme

mlr7552 Mlo

NE0469 Neu

PA5161 Pae

XF0255 Xfa

XC rmlb Xca

tll0458 Tel

slr0836 Ssp

CT rfbB Cte

CAC2332 Cac

FN1667 Fnu

lmo1083 Lmo

MTH1789 Mth

10

Rv3465 Mtu

ML1965 Mlo

SPy0935 Spy

SCO0400 Sco

TVN0901 Tvo

DRA0043 Dra

BS spsK 2 Bsu

APE1178 Ape

SSO0833 Sso

LA1659 Lint

AGl529 Atu

XCC2013 Xca

SMb21325 Sme

mlr7551 Mlo

PH0416 Pho

PAB0787 Pab

CC3633 Ccr

PA5164 Pae

Neur0612 Neu

slr1933 Ssp

tll0456 Tel

NMB0081 Nme

NMA0206 Nme

SO3160 Son

CAC2331 Cac

MTH1790 Mth

XF0257 Xfa

rfbC CT Cte

MA3780 Mac

EF2193 Efe

lp 1188 Lpl

lmo1082 Lmo

rfbC Eco

STM2094 Sty

rmlC VC Vch

69

COG1898

(a)

(b) (c)

Genome Biology 2003, 4:R55

Page 15: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.15

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

Mycoplasma genitalium (Mge), Mycoplasma pneumoniae(Mpn), Mycobacterium tuberculosis (Mtu), Mycobacteriumleprae (Mle), Pasteurella multocida (Pmu), Neisseria menin-gitidis (Nme), Pseudomonas aeruginosa (Pae), Rickettsiaprowazekii (Rpr), Rickettsia conorii (Rco), SynechocystisPCC6803 (Ssp), Thermotoga maritima (Tma), Treponemapallidum (Tpa), Vibrio cholerae (Vch), Xylella fastidiosa(Xfa), Buchnera sp. (Bsp), Caulobacter crescentus (Ccr), andUreaplasma urealyticum (Uur). Archaeal species abbrevia-tions: Aeropyrum pernix (Ape), Archaeoglobus fulgidus(Afu), Halobacterium sp. (Hsp), Methanothermobacter ther-moautotrophicum (Mth), Methanococcus jannaschii (Mja),Pyrococcus horikoshii (Pho), Pyrococcus abyssi (Pab), Ther-moplasma volcanium (Tvo), Thermoplasma acidophilum(Tac), Sulfolobus solfataricus (Sso). In addition, the follow-

ing species were included in the case studies described in thetext; bacteria: Agrobacterium tumefaciens (Atu), Bifidobac-terium longum (Blo), Brucella melitensis (Rso), Chlorobiumtepidum (Cte), Enterococcus faecalis (Efa), Fusobacteriumnucleatum (Fnu), Lactobacillus plantarum (Lpl), Leptospirainterrogans serovar (Lint), Listeria innocua (Lin), Listeriamonocytogenes (Lmo), Nitrosomonas europaea (Neu), Nos-toc sp. (Nsp), Oceanobacillus iheyensis (Oih), Ralstoniasolanacearum (Rso), Sinorhizobium meliloti (Sme), Strepto-myces coelicolor (Sco), Thermoanaerobacter tengcongensis(Tte), Thermosynechococcus elongatus (Tel), Xanthomonascampestris (Xca), Shewanella oneidensis (Son); archaea:Methanopyrus kandleri (Mka), Methanosarcina acetivorans(Mac), Pyrobaculum aerophilum (Pae), Pyrococcus furiosus(Pfu).

Genes wih(continued from previous page) radioduransFigure 5Genes with different phylogenetic affinities in the lipopolysaccharide biosynthesis operon of Methanothermobacter thermoautotrophicus and Deinococcus radiodurans. (a) Organization of the lipopolysaccharide biosynthesis operon in different prokaryotes. COG1091 - dTDP-4-dehydrorhamnose reductase; COG1209 dTDP-glucose pyrophosphorylase; COG1898 - dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes; COG1088 - dTDP-D-glucose 4,6-dehydratase. The designations are as in Figure 1a. For species abbreviations, see Materials and methods. (b-e) Unrooted maximum-likelihood tree for RfbB (b), RfbC (c), RfbA (d) and RfbD (e); the designations are as in Figure 1b.

COG1209

10

TM0862 Tma

sll0207 Ssp

PAB0784 Pab

PH0413 Pho

TVN0899 TVo

APE1181 Ape

SSO0831 Sso

AF0325 Afu

MA3025 Mac

BS spsI Bsu

BH3363 Bha

CAC2333 Cac

MTH1791 Mth

rlmA CJ Cje

lp 1186 Lpl

SPy0933 Spy

L197041 Lla

EF2194 Efe

lmo1081 Lmo

rfbA Eco

STM2095 Sty

PA5163 Pae

Neur0611 Neu

LA1662 Lint

DRA0042 Dra

AGl532 Atu

CC1141 Ccr

XF0256 Xfa

tll0457 Tel

SMb21324 Sme

mlr7550 Mlo

NMB0080 Nme

rmlA VC Vch

rfbA CT Cte

NCgl0325 Cgl

Rv0334 Mtu

ML2503 Mle

1

2

10

rfbD Eco

STM2096 Sty

rfbD NE Neu

PA5162 Pae

XF0258 Xfa

tlr0951 Tel

SMb21327 Sme

mlr7553 Mlo

AGl531 Atu

CC1140 Ccr

NMB0756 Nme

rmlD VC Vch

sll1395 Ssp

rfbD CT Cte

Rv3266c Mtu

ML0751 Mle

NCgl0326 Cgl

SPy0784 SPy

lp 1190 Lpl

TVN0900 Tvo

SSO0832 Sso

APE1179 Ape

DRA0044 Dra

PH0417 Pho

AF0323a Afu

PAB0789 Pab

MTH1792 Mth

FNV1769 Fnu

CAC2315 Cac

BS spsK 1 Bsu

BH3365 Bha

MA3778 Mac

COG1091

2

1

(d) (e)

Genome Biology 2003, 4:R55

Page 16: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.16 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

Figure 6 (see legend on next page)

PAB0888 PAB0889 PAB0890 PAB0891 PAB0892 PAB0893 PAB0894 PAB0895

TM0548 TM0549 TM0550 TM0551 TM0552 TM0553 TM0554 TM0555 TM0556

PAB0286 PAB0287 PAB0288 PAB0289

PH1727 PH1726 PH1724 PH1722

ilvB ilvN ilvC leuA leuB leuC leuD

BH3062 BH3061 BH 3060 BH3059 BH3058 BH3057 BH3056

COG0028ilvB

COG0440ilvN

COG0059ilvC

COG0129ilvD

COG0119leuA

COG0065leuC

COG0473leuB

Tma

Pab

Pho

Bsu

ilvI leuO leuD leuL leuA leuB leuCEco

HI0986 HI0987Hin, Cj, Vch, Pae, Buc_plasmid, Pmu, HI0988 HI0989

TM0290 TM0291 TM0292 TM0293 TM0294

CAC3176 CAC3174 CAC3173 CAC3172 CAC3171 CAC3170 CAC3169

BH3055

Ca

ilvH

SA1858 SA1859 SA1860 SA1861 SA1862 SA1863 SA1864 SA1885 SA1866Sau

MTH1386 MTH1387 MTH1388 MTH1389Mth

COG0066leuD

Pab

Pfu

Pfu

PF1678 PF1679 PF1680

PF0935 PF0936 PF0937 PF0938 PF0939 PF0940 PF0941 PF0942

Tma

Bha

COG0119

10

PAE1993 PaeTTE0016 Tte

TM0553 TmaSA1862 SauL0073 Lla

BS leuA BsuBH3058 Bha

DR1482 Draaq 2090 Aae

NMB1070 NmeXF1818 Xfa

CC1541 CgrCj1719c Cje

BUpL04 BspleuA Eco

YPO0533 YpeVC2490 VchSO4236 SonHI0986 Hin

PM1962 PmuLA2202 Lint

slr0186 Ssptll1397 Telalr4840 Nsp

PAB0890 PabPF0937 Pfu

MK0391 MkaMJ1195 Mja

MTH1481 MthMA4615 Mac

AF0219 AfuMK0275 MkaMJ1392 Mja

MTH723 MthMA3793 Mac

AF0957 AfuMA3342 Mac

MK1209 MkaMJ0503 Mja

MTH1630 MthPH1727Pho

PAB0286 PabPF1678 Pfu

PAE1986 PaeSSO0977 Sso

PAB0894 PabPF0941 Pfu

aq 356 AaeTM0552 Tma

leuA 1 CT CteSCO5529 Sco

sll1564 Ssptlr0850 Tel

alr3522 NspCAC3174 Cac

SMc02546 Smemlr7805 Mlo

PAE1330 PaeSSO0127 Sso

TTE0472 Ttealr1407 Nsp

2

1

4,6

3

5

COG0473

10

aq 244 AaeTM0556 Tma

LA2152 LintCj1718c Cje

BUpL05 BspVC2491 VchleuB EcoYPO0532 Ype

SO4235 SonHI0987 Hin

PM1961 PmuCAC3171 Cac

TTE0019 TteleuB CT Cte

slr1517 Sspalr1313 Nsp

tlr1600TelPA3118 Pae

XF2372 XfaNMB1031 Nme

Neur0621 NeuCC0193 Ccr

AGc5067 AtuSMc04405 Sme

mll4399 MloBS leuB Bsu

BH3057 BhaDR1785 Dra

L0074 LlaSA1863 Sau

lmo1988 ImoMJ0720 Mja

MTH1388 MthMTH184 Mth

MK0782 MkaSSO0723 Sso

AF0628 AfuMJ1596 Mja

PH1722 PhoPAB0289 Pab

MA0201 MacPAB0893 Pab

PF0940 PfuNCgl1237 Cgl

SCO5522 ScoRv2995c Mtu

ML1691 Mle

3

4

2

1

(a)

(b) (c)

Genome Biology 2003, 4:R55

Page 17: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

http://genomebiology.com/2003/4/9/R55 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. R55.17

com

ment

reviews

reports

refereed researchdepo

sited researchinteractio

nsinfo

rmatio

n

Reconstruction of gene neighborhoodsGene neighborhoods for the 41 compared genomes werereconstructed as previously described [25]. Briefly, the collec-tion of clusters of orthologous groups of proteins from com-plete genomes (COGs) [28] was used as the source ofinformation on orthologous relationships for detecting con-served gene pairs. For the purpose of this analysis only 'highlyconserved' gene pairs were considered, that is, those formedby genes from two COGs that were present in the same orien-tation and separated by less than three genes in at least 10 ofthe compared genomes. This conservative approach wasadopted in order to ensure that all analyzed gene pairs belongto the same operon. At the next step, overlapping gene pairswere joined in triplets; each triplet was required to exist in atleast one genome. Overlapping triplets were used to constructgene arrays by run search in an oriented graph; a gene arraymay or may not be found in its entirety in any availablegenome. Finally, gene arrays that shared at least three COGswere clustered into neighborhoods by using a single-linkage

clustering algorithm [25]. Conserved gene pairs that did notbelong to the reconstructed gene arrays were also analyzed.

Searching for candidate horizontally transferred genesThe protein sequences encoded by the genes of each neigh-borhood were searched against the non-redundant proteinsequence database (NCBI, NIH, Bethesda) using the BLASTPprogram. The BLAST hits were analyzed to identify theirpotential phylogenetic affinity. For each protein, the best hitswere identified to the taxon to which the given speciesbelongs (hereinafter, reference taxon) and to other majortaxa; hits to closely related species were disregarded (seeTable 1S in the additional data file). Proteins that had moresignificant (lower E-value) hits to a non-reference taxon thanto the reference taxon were considered candidates for hori-zontal transfer and the respective orthologous protein clus-ters were subject to further phylogenetic analysis as describedin the next section. If phylogenetic analysis indicated that aparticular gene was likely to be horizontally transferred,

Genes wit((continued from previous page)Figure 6Genes with different phylogenetic affinities in the leucine/isoleucine biosynthesis operon. (a) Operon organization in different prokaryotic species. COG0028 - acetolactate synthase, large subunit; COG0440 - acetolactate synthase, small subunit; COG0059 - ketol-acid reductoisomerase; COG0129 - dihydroxyacid dehydratase; COG0119 - isopropylmalate synthases; COG0473 - isocitrate/isopropylmalate dehydrogenase; COG0066 - 3-isopropylmalate dehydratase, small subunit; COG0065 - 3-isopropylmalate dehydratase, large subunit. The designations are as in Figure 1a. For species abbreviations, see Materials and methods. (b-e) Unrooted maximum-likelihood tree for LeuA (b), LeuB (c), LeuC (d) and LeuD (e); the designations are as in Figure 1b.

COG0065

10

DR1778 DraRv2988c MtuML1685 Mle

SCO5553 ScoNCgl1262 Cgl

BH3056 BhaBS leuC Bsu

L0075 LlaSA1864 Sau

sll1470all1417 Nsp

tlr0909 NspLA2095 Lint

VC2492 VchleuC Eco

YPO0531 YpeHI0988 Hin

PM1960 PmuBUpL06 Bsp

Cj1717c CjeNMB1036 Nme

Neur0619 NeuPA3121 PaeXF2375 Xfa

CC0196 CcrAGc4910 AtuSMc03823 Sme

mll4272 MloCT0614 Cte

MK1208 MkaPF1679 Pfu

PH1726 PhoPAB0287 Pab

DR1610 DraSSO2471 Sso

PAE1984 PaeTM0291 TmaAF2199 Afu

MJ1003 MjaMTH1631Mth

MA3085 MacMK1440 Mka

MJ0499 MjaMTH1386 Mth

MA1393 MacAF1963 Afu

TM0554 Tmaaq 940 AaePAB0891 Pab

PF0938 PfuCAC3173 Cac

TTE0017 Tte

2,4

1

3

COG0066

10

SCO5554 ScoNCgl1263 Cgl

Rv2987c MtuML1684 Mle

Cj1716c CjeL0076 Lla

SA1865 SauBH3055 Bha

BS leuD BsuCC0195 Ccr

AGc5065 AtuSMc03795 Sme

mll4408 MloLA2096 Lint

BUpL07 BspleuD EcoYPO0530 Ype

HI0989 HinVC2493 Vch

PM1959 PmuNeur0620 Neu

NMB1034 NmePA3120 Pae

XF2374 Xfasll1444 Ssp

tlr1234 Telall1416 Nsp

CT0613 CteDR1784 Dra

MK1206 MkaMK0781 Mka

AF0629 AfuMTH1387 MthMJ1277 Mja

MTH829 MthMJ1271 Mja

SSO2470 SsoPAE1991 Pae

MA1223 MacMA3751 Mac

AF1761 AfuTM0292 Tma

DR1614 DraPAB0288 Pab

PH1724 PhoPF1680 Pfu

aq 1398 AaeCAC3172 CacTM0555 Tma

TTE0018 TtePAB0892 PabPF0939 Pfu

1

3

2,4

(d) (e)

Genome Biology 2003, 4:R55

Page 18: Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ by marina v omelchenko kira s makarova† yuri i wolf igor b rogozin and eugene v koonin

R55.18 Genome Biology 2003, Volume 4, Issue 9, Article R55 Omelchenko et al. http://genomebiology.com/2003/4/9/R55

phylogenetic trees were built also for the genes predicted tobelong to the same operon. When different phylogeneticaffinities were found for genes of the same predicted operon,this operon was considered to be 'mosaic'.

Phylogenetic analysisMultiple protein sequence alignments were constructed usingthe T-Coffee program [29] and positions containing >70%gaps were excluded. Distance trees were constructed by usingthe least-square method as implemented in the FITCH pro-gram of the PHYLIP package [30,31]. The least-square treeswere subjected to maximum-likelihood local rearrangementusing the ProtML program of the MOLPHY package, with theJTT-F model of amino acid substitutions [32,33]. The result-ing trees are a surrogate for maximum-likelihood phyloge-nies; exhaustive maximum-likelihood tree construction isimpractical for the number of species analyzed here. Boot-strap analysis was performed for each maximum-likelihoodtree using the Resampling of Estimated Log-Likelihoods(RELL) method as implemented in MOLPHY [32-34]. Alter-native placements of selected clades in maximum-likelihoodtrees were compared by using the rearrangement optimiza-tion (Kishino-Hasegawa) method as implemented in theProtML program [34].

Additional data fileAdditional data, including schematics of operon organizationand phylogenetic trees for all gene clusters listed in Table 2,are available in an additional data file (Additional data file 1).Additional data file 1Additional data, including schematics of operon organization and phylogenetic trees for all gene clusters listed in Table 2Additional data, including schematics of operon organization and phylogenetic trees for all gene clusters listed in Table 2Click here for additional data file

AcknowledgementsWe thank Jeffrey Lawrence for critical reading of the manuscript. Marina V.Omelchenko is supported by a grant from the US Department of Energy(Office of Biological and Environmental Research, Office of Science) grantsDE-FG02 01ER63220 from the Genomes to Life Program.

References1. Jacob F, Monod J: Genetic regulatory mechanisms in the syn-

thesis of proteins. J Mol Biol 1961, 3:318-356.2. Miller JH, Reznikoff WSE: The Operon. Cold Spring Harbor, NY: Cold

Spring Harbor Laboratory; 1978. 3. Mushegian AR, Koonin EV: Gene order is not conserved in bac-

terial evolution. Trends Genet 1996, 12:289-290.4. Dandekar T, Snel B, Huynen M, Bork P: Conservation of gene

order: a fingerprint of proteins that physically interact. TrendsBiochem Sci 1998, 23:324-328.

5. Watanabe H, Mori H, Itoh T, Gojobori T: Genome plasticity as aparadigm of eubacteria evolution. J Mol Evol 1997, 44:S57-S64.

6. Itoh T, Takemoto K, Mori H, Gojobori T: Evolutionary instabilityof operon structures disclosed by sequence comparisons ofcomplete microbial genomes. Mol Biol Evol 1999, 16:332-346.

7. Wolf YI, Rogozin IB, Kondrashov AS, Koonin EV: Genome align-ment, evolution of prokaryotic genome organization, andprediction of gene function using genomic context. GenomeRes 2001, 11:356-372.

8. Lawrence JG: Shared strategies in gene organization amongprokaryotes and eukaryotes. Cell 2002, 110:407-413.

9. Lawrence JG, Roth JR: Selfish operons: horizontal transfer maydrive the evolution of gene clusters. Genetics 1996, 143:1843-1860.

10. Lawrence J: Selfish operons: the evolutionary impact of gene

clustering in prokaryotes and eukaryotes. Curr Opin Genet Dev1999, 9:642-648.

11. Lawrence JG: Selfish operons and speciation by gene transfer.Trends Microbiol 1997, 5:355-359.

12. Gray GS, Fitch WM: Evolution of antibiotic resistance genes:the DNA sequence of a kanamycin resistance gene from Sta-phylococcus aureus. Mol Biol Evol 1983, 1:57-66.

13. Koonin EV, Makarova KS, Aravind L: Horizontal gene transfer inprokaryotes - quantification and classification. Annu RevMicrobiol 2001, 55:709-742.

14. Aravind L, Tatusov RL, Wolf YI, Walker DR, Koonin EV: Evidencefor massive gene exchange between archaeal and bacterialhyperthermophiles. Trends Genet 1998, 14:442-444. A publishederratum appears in Trends Genet 1998, 15:41.

15. Jain R, Rivera MC, Lake JA: Horizontal gene transfer amonggenomes: the complexity hypothesis. Proc Natl Acad Sci USA1999, 96:3801-3806.

16. Nelson KE, Clayton RA, Gill SR, Gwinn ML, Dodson RJ, Haft DH,Hickey EK, Peterson JD, Nelson WC, Ketchum KA, et al.: Evidencefor lateral gene transfer between archaea and bacteria fromgenome sequence of Thermotoga maritima. Nature 1999,399:323-329.

17. Doolittle WF: Lateral genomics. Trends Cell Biol 1999, 9:M5-M8.18. Garcia-Vallve S, Romeu A, Palau J: Horizontal gene transfer in

bacterial and archaeal complete genomes. Genome Res 2000,10:1719-1725.

19. Gogarten JP, Doolittle WF, Lawrence JG: Prokaryotic evolution inlight of gene transfer. Mol Biol Evol 2002, 19:2226-2238.

20. Makarova KS, Ponomarev VA, Koonin EV: Two C or not two C:recurrent disruption of Zn-ribbons, gene duplication,lineage-specific gene loss, and horizontal gene transfer inevolution of bacterial ribosomal proteins. Genome Biol 2001,2:research0033.1-0033.14.

21. Brochier C, Philippe H, Moreira D: The evolutionary history ofribosomal protein RpS14: horizontal gene transfer at theheart of the ribosome. Trends Genet 2000, 16:529-533.

22. Brochier C, Bapteste E, Moreira D, Philippe H: Eubacterial phylog-eny based on translational apparatus proteins. Trends Genet2002, 18:1-5.

23. Davies RL, Campbell S, Whittam TS: Mosaic structure and molec-ular evolution of the leukotoxin operon (lktCABD) in Man-nheimia (Pasteurella) haemolytica, Mannheimia glucosida, andPasteurella trehalosi. J Bacteriol 2002, 184:266-277.

24. Schnaitman CA, Klena JD: Genetics of lipopolysaccharide bio-synthesis in enteric bacteria. Microbiol Rev 1993, 57:655-682.

25. Rogozin IB, Makarova KS, Murvai J, Czabarka E, Wolf YI, Tatusov RL,Szekely LA, Koonin EV: Connected gene neighborhoods inprokaryotic genomes. Nucleic Acids Res 2002, 30:2212-2223.

26. Snel B, Lehmann G, Bork P, Huynen MA: STRING: a web-serverto retrieve and display the repeatedly occurring neighbour-hood of a gene. Nucleic Acids Res 2000, 28:3442-3444.

27. Entrez Genome [http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/org.html]

28. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, ShankavaramUT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV: TheCOG database: new developments in phylogenetic classifica-tion of proteins from complete genomes. Nucleic Acids Res2001, 29:22-28.

29. Notredame C, Higgins DG, Heringa J: T-Coffeee: a novel methodfor fast and accurate multiple sequence alignment. J Mol Biol2000, 302:205-217.

30. Fitch WM, Margoliash E: Construction of phylogenetic trees. Sci-ence 1967, 155:279-284.

31. Felsenstein J: Inferring phylogenies from protein sequences byparsimony, distance, and likelihood methods. Methods Enzymol1996, 266:418-427.

32. Hasegawa M, Kishino H, Saitou N: On the maximum likelihoodmethod in molecular phylogenetics. J Mol Evol 1991, 32:443-445.

33. Adachi J, Hasegawa M: MOLPHY: programs for molecular phy-logenetics. In Computer Science Monographs 27 Tokyo: Institute ofStatistical Mathematics; 1992.

34. Kishino H, Miyata T, Hasegawa M: Maximum likelihood inferenceof protein phylogeny and the origin of chloroplasts. J Mol Evol1990, 31:151-160.

Genome Biology 2003, 4:R55