Supplementary Data and Methods for: Genome sequence of the metazoan plant-parasitic nematode Meloidogyne incognita

Pierre Abad¹,²,³, Jérôme Gouzy⁴, Jean-Marc Aury⁵,⁶,⁷, Philippe Castagnone-Sereno¹,²,³, Etienne G.J. Danchin¹,²,³, Emeline Deleury¹,²,³, Laetitia Perfus-Barbeoch¹,²,³, Véronique Anthouard⁵,⁶,⁷, François Artiguenave⁵,⁶,⁷, Vivian C. Blok⁸, Marie-Cécile Caillaud¹,²,³, Pedro M. Coutinho⁹, Corinne Dasilva⁵,⁶,⁷, Francesca De Luca¹⁰, Florence Deau¹,²,³, Magali Esquibet¹¹, Timothé Flutre¹², Jared V. Goldstone¹³, Noureddine Hamamouch¹⁴, Tarek Hewezi¹⁵, Olivier Jaillon⁵,⁶,⁷, Claire Jubin⁵,⁶,⁷, Paola Leonetti¹⁰, Marc Magliano¹,²,³, Tom R. Maier¹⁵, Gabriel V. Markov¹⁶,¹⁷, Paul McVeigh¹⁸, Graziano Pesole¹⁹,²⁰, Julie Poulain⁵,⁶,⁷, Marc Robinson-Rechavi²¹,²², Erika Sallet²³,²⁴, Béatrice Ségurens⁵,⁶,⁷, Delphine Steinbach¹², Tom Tytgat²⁵, Edgardo Ugarte⁵,⁶,⁷, Cyril van Ghelder¹,²,³, Pasqua Veronico¹⁰, Thomas J. Baum¹⁵, Mark Blaxter²⁶, Teresa Bleve-Zacheo¹⁰, Eric L. Davis¹⁴, Jonathan J. Ewbank²⁷, Bruno Favery¹,²,³, Eric Grenier¹¹, Bernard Henrissat⁹, John T. Jones⁸, Vincent Laudet¹⁶, Aaron G. Maule¹⁸, Hadi Quesneville¹², Marie-Noëlle Rosso¹,²,³, Thomas Schiex²⁴, Geert Smant²⁵, Jean Weissenbach⁵,⁶,⁷, Patrick Wincker⁵,⁶,⁷

¹ INRA, UMR 1301, 400 route des Chappes, F-06903 Sophia-Antipolis, France.
² CNRS, UMR 6243, 400 route des Chappes, F-06903 Sophia-Antipolis, France.
³ UNSA, UMR 1301, 400 route des Chappes, F-06903 Sophia-Antipolis, France.
⁴ Laboratoire Interactions Plantes Micro-organismes, UMR441/2594, INRA/CNRS, Chemin de Borde Rouge, BP 52627, F-31320 Castanet Tolosan, France.
⁵ Genoscope (CEA), 2 rue Gaston Crémieux, CP5706, F-91057 Evry, France.
⁶ CNRS, UMR 8030, 2 rue Gaston Crémieux, CP5706, F-91057 Evry, France.
⁷ Université d'Evry, F-91057 Evry, France.
⁸ Plant Pathology Programme, SCRI, Invergowrie, Dundee, DD2 5DA, UK.
⁹ CNRS, UMR 6098 CNRS and Universites of Aix-Marseille I & II, Case 932, 163 Av. de Luminy, F-13288 Marseille, France.
¹⁰ Istituto per la Protezione delle Piante, Consiglio Nazionale delle Ricerche, Via G. Amendola 165/a, 70126, Bari, Italy.
¹¹ INRA, Agrocampus Rennes, Univ. Rennes I, UMR1099 BiO3P, Domaine de la Motte, F-35653 Le Rheu Cedex, France.
¹² INRA, UR1164 Unité de Recherche en Génomique et Informatique (URGI), 523 place des terrasses de l'Agora, F-91034 Evry, France.
¹³ Biology Department, Woods Hole Oceanographic Institution, Co-op Building, MS #16, Woods Hole, Massachusetts 02543, USA.
¹⁴ Department of Plant Pathology, North Carolina State University, 840 Method Road, Unit 4, Box 7903 Raleigh, North Carolina 27607, USA.




¹⁵ Department of Plant Pathology, Iowa State University, 351 Bessey Hall, Ames, Iowa 50011, USA.
¹⁶ Université de Lyon, Institut de Génomique Fonctionnelle de Lyon, Molecular Zoology team, Ecole Normale Supérieure de Lyon, Université Lyon 1, CNRS, INRA, Institut Fédératif 128 Biosciences Gerland, Lyon Sud, 46 allée d'Italie, F-69364 Lyon Cedex 07, France.
¹⁷ USM 501 - Evolution des Régulations Endocriniennes, Muséum National d'Histoire Naturelle, 7 rue Cuvier, F-75005 Paris, France.
¹⁸ Biomolecular Processes: Parasitology, School of Biological Sciences, Medical Biology Centre, 97 Lisburn Road, Queen's University Belfast, Belfast, BT9 7BL, UK.
¹⁹ Dipartimento di Biochimica e Biologia Molecolare “E. Quagliariello”, University of Bari, Via Orabona 4, 70126 Bari, Italy.
²⁰ Istituto Tecnologie Biomediche, Consiglio Nazionale delle Ricerche, Via G. Amendola, 122/D – 70126 Bari, Italy.
²¹ Department of Ecology and Evolution, University of Lausanne, UNIL-Sorge, Le Biophore, CH - 1015 Lausanne, Switzerland.
²² Swiss Institute of Bioinformatics, quartier Sorge, Bâtiment Genopode, CH - 1015 Lausanne, Switzerland.
²³ Plateforme Bioinformatique du Genopole Toulouse Midi-Pyrénées, GIS Toulouse Genopole, 24 Chemin de Borde Rouge, BP 52627, F-31320 Castanet Tolosan, France.
²⁴ Unité de Biométrie et d'Intelligence Artificielle UR875, INRA, Chemin de Borde Rouge, BP 52627, F-31320 Castanet Tolosan, France.
²⁵ Laboratory of Nematology, Wageningen University, Binnenhaven 5, 6709PD Wageningen, The Netherlands.
²⁶ Institute of Evolutionary Biology, University of Edinburgh, Kings Buildings, Ashworth Laboratories, West Mains Road, Edinburgh EH9 3JT, UK.
²⁷ INSERM/CNRS/Université de la Méditerranée, Centre d'Immunologie de Marseille-Luminy, 163 av. de Luminy, Case 906, F-13288, Marseille cedex 09, France.

Correspondence should be addressed to P.A. ([email protected]).


1 Additional background information
  1.1 Why have we sequenced the Meloidogyne incognita genome?
  1.2 Economic Rationale
  1.3 Biological and phytopathological traits
  1.4 Phylogenetic significance
  1.5 Genetics of the Meloidogyne genus
2 Genome assembly and structure
  2.1 Assembly
  2.2 Detection of scaffold pairs and triplets
3 Repetitive and non protein-coding sequences
  3.1 Repeats and Transposable Elements
  3.2 Noncoding RNAs (ncRNAs)
    3.2.1 Ribosomal RNA
    3.2.2 tRNA
    3.2.3 miRNA
  3.3 Spliced Leader SL
4 Operonic structures
  4.1 Background
  4.2 Supplementary results
  4.3 Summary
5 Protein coding gene set
  5.1 Supplementary results
  5.2 Similarity pattern between predicted proteins
6 Automatic functional annotation
7 Expert functional annotation
  7.1 Carbohydrate Active enZymes (CAZymes)
  7.2 Proteases
  7.3 Orthologs to published nematode plant-parasitism genes
  7.4 Pioneer genes
  7.5 Gene families involved in protection against environmental stresses
  7.6 Nuclear Receptors
  7.7 Kinome
  7.8 Chemosensory G Protein-Coupled Receptors (GPCRs)
  7.9 Neuropeptides
  7.10 Sex determination
  7.11 RNAi genes
  7.12 Orthologs of C. elegans lethal RNAi genes
8 Supplementary Methods
  8.1 Biological material
    8.1.1 Nematode strain and DNA preparation
    8.1.2 EST resources
  8.2 Genome sequencing and assembly
  8.3 Detection of scaffold pairs and triplets
  8.4 Detection of repetitive elements
  8.5 Detection of non-coding RNAs
  8.6 Splice Leaders (SL) annotation and detection
  8.7 Detection of Operons
  8.8 Gene model predictions
    8.8.1 Accuracy assessment
  8.9 Automatic functional annotation
  8.10 Detection and annotation of CAZymes
    8.10.1 Detection of CAZymes and modular annotation
    8.10.2 Functional annotation of detected CAZymes
    8.10.3 Comparison of CAZyme repertoires
    8.10.4 Search for homologs of M. incognita PCWD enzymes
  8.11 Annotation of the proteases set
  8.12 Antioxidant enzymes
  8.13 Glutathione-S-transferases
  8.14 Cytochromes P450
  8.15 Immune response
  8.16 Detection and annotation of Nuclear Receptors (NRs)
  8.17 Annotation of Kinases
  8.18 GPCRs
  8.19 Neuropeptides
  8.20 Orthologs of C. elegans lethal RNAi genes
9 REFERENCES


1 Additional background information

1.1 Why have we sequenced the Meloidogyne incognita genome?

Nematodes are simple roundworms consisting of an elongated stomach and a reproductive system inside a resistant outer cuticle. About 25,000 species have been described in the phylum Nematoda, but as many as ten million more may remain to be discovered. In terms of individuals, nematodes account for an estimated four of every five animals on Earth. Nematodes may be free-living, predaceous, or parasitic. The last group represents a major challenge to human and animal health and to agriculture. Among them, root-knot nematodes (RKN) are the most damaging species, infecting almost all cultivated plant species. To gain access to all of the genes involved in the pathogenicity of these nematodes, we have sequenced the genome of the Southern RKN Meloidogyne incognita to high-quality initial draft using a whole-genome shotgun strategy. This is the first analysis of an assembled genome of a plant-parasitic nematode (PPN), and it also represents the first genome of a metazoan plant parasite.

1.2 Economic Rationale

Plant-parasitic nematodes are responsible for an estimated 100 billion euros in crop damage worldwide each year. Among them, M. incognita is the most widespread species and is found in every country in which the lowest temperature exceeds 3°C. It has therefore been postulated that this species is possibly the most damaging crop pathogen in the world. Currently, nematicides, plant resistance and cultural practices are the most important and reliable means of controlling nematodes. However, most nematicides are non-specific and notoriously toxic, and they pose a threat to the soil ecosystem, ground water and human health. The use of agrochemicals is therefore restricted and will be reduced even more drastically in the future. For example, methyl bromide, the most commonly used fumigant nematicide, was definitively prohibited in Europe in 2005 under EU regulation.


1.3 Biological and phytopathological traits

The nematode M. incognita is an obligate parasite that has to be cultured on host plants. It reproduces by mitotic parthenogenesis, with an average life cycle of six to eight weeks at 25°C (our laboratory conditions). After maturation, females lay eggs within a gelatinous matrix outside the root. Second-stage juveniles (J2) hatch from eggs in the soil and invade the root tissues, moving towards the vascular cylinder. After three further moults, females become piriform and sedentary and induce the formation of five to seven giant, multinucleate cells upon which they feed. Males remain vermiform and leave the root (Main Manuscript Fig. 1). In RKN, the mode of reproduction has been shown to be linked to the plant host range: parthenogenetic species have a wide host range, whereas sexual species are more specialized¹,².

This obligate parasite interacts with its hosts in a unique and intriguing way. It induces the redifferentiation of root cells into specialized feeding cells that are essential for nematode growth and reproduction. Upon infection, motile second-stage juveniles (J2) present in the soil penetrate the roots, preferably at the elongation zone just behind the root tip. They migrate between root cells towards the vascular cylinder to select a competent root cell for the induction of enlarged and multinucleated feeding cells. These typical nematode feeding cells, named giant cells, contain large volumes of cytoplasm and are converted into nutrient sinks that serve as the sole food source for the subsequent sedentary parasitic stages. They result from synchronous repeated karyokinesis without cell division. Hyperplasia and hypertrophy of the surrounding cells lead to the formation of a typical root gall or knot, the primary visible symptom of infection. It is not yet understood how feeding sites are induced, but pathogenicity factors secreted by the nematode are believed to play key roles during parasitism and might have direct effects on recipient host cells.

1.4 Phylogenetic significance

M. incognita belongs to the order Tylenchida, a very large and diverse group of nematodes that contains a majority of the known plant-parasitic species (Fig. S1). Within Tylenchida, members of the family Heteroderidae, including the RKN (Meloidogyne spp.) and the cyst nematodes (Globodera spp. and Heterodera spp.), are by far the most damaging to world agriculture. To date, more than 80 RKN species have been described, and M. incognita is unquestionably the most important one in terms of distribution and damage caused, and also in terms of research efforts worldwide. Although representative of the genus, M. incognita, like the other mitotic RKN species, holds a phylogenetic position quite distant from that of meiotic RKN species such as M. hapla³⁻⁵. The timing of the divergence of the parthenogenetic RKN species from the amphimictic meiotic ones is still debated: it has been estimated at about 43 million years ago⁶ but might be far older⁷.

Figure S1 | Phylogeny of the phylum Nematoda. Species with an ongoing or complete genome project are highlighted.

Modified from Blaxter et al. 1998⁸. Images from the nematode.org (H. contortus, Caenorhabditis spp. and B. malayi), Bilkent University (A. caninum, www.bilkent.edu.tr), Roxportal (A. lumbricoides, www.roxportal.com) and McGill University (T. spiralis, www.medicine.mcgill.ca) websites.


1.5 Genetics of the Meloidogyne genus

Meloidogyne incognita reproduces by obligatory mitotic parthenogenesis, and cytogenetic analysis has revealed isolates with chromosome numbers ranging from 32 to 36 as well as isolates with 40 to 48 chromosomes. Assuming that the haploid chromosome number is n=18 (as observed in many meiotic, sexually reproducing species), these isolates can be considered as diploids (2n) or hypotriploids (3n-x), respectively, both with possible chromosomal/segmental losses leading to the observed aneuploidy⁹. Although it reproduces by mitotic parthenogenesis, M. incognita adapts readily to unfavourable abiotic or biotic environmental conditions. For example, the emergence of virulent populations that can overcome plant resistance genes has been extensively documented¹⁰. This potential for rapid adaptive evolution appears paradoxical given the apomictic mode of reproduction of the nematode, and it raises fundamental questions about the genetic mechanisms leading to phenotypic variability and evolution in clonal organisms challenged with heterogeneous and changing environments¹¹.
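The ploidy reasoning in this section reduces to simple arithmetic on the haploid number. As a minimal sketch, assuming n = 18 as stated in the text (the function name `describe_karyotype` and the rule of taking the smallest multiple of n that can contain the observed count are illustrative choices, not the authors' method), an observed chromosome count can be written as a ploidy level minus a number of lost chromosomes:

```python
import math

HAPLOID_N = 18  # haploid chromosome number assumed in the text


def describe_karyotype(count: int, n: int = HAPLOID_N) -> str:
    """Write a chromosome count as k*n - x, with k the smallest level where k*n >= count."""
    k = math.ceil(count / n)   # smallest ploidy level that can contain the count
    missing = k * n - count    # chromosomes missing relative to that level
    return f"{k}n" if missing == 0 else f"{k}n-{missing}"


# Boundary values of the two ranges of chromosome numbers quoted in the text
for observed in (32, 36, 40, 48):
    print(observed, describe_karyotype(observed))
```

Under this rule, counts of 32 to 36 fall at or just below 2n = 36, and counts of 40 to 48 fall below 3n = 54, which matches the diploid and hypotriploid labels used above.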


2 Genome assembly and structure

2.1 Assembly

The 1,000,873 individual reads (mean trimmed read length of 638 bases, for a total of 638.5 Mb) were assembled into 2,817 supercontigs with an N50 of about 82 kb (Table S1; Supplementary Methods, section 8.2).

Table S1 | Statistics of the M. incognita genome assembly.

Category         Number    N50        Total size
Contigs          9,538     12.8 kb    82 Mb
Super-contigs    2,817     82.7 kb    86 Mb
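For readers unfamiliar with the N50 statistic reported in Table S1, the sketch below uses the usual definition: the length L such that sequences of length L or more contain at least half of all assembled bases. The function name `n50` and the toy lengths are hypothetical; the real contig and supercontig lengths are not given in this section.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L hold at least half the bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy supercontig lengths in bp (hypothetical, not taken from the assembly)
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))
```

Because the statistic is weighted by sequence length, a few long supercontigs raise the N50 far more than many short ones, which is why it is reported alongside the sequence count and total size.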

We plotted the cumulative number of base pairs covered at a given fold (Fig. S2). The homogeneity of the peak shows that most genomic regions are present at the same level. The average coverage is about 5x, or half the coverage theoretically expected for a homozygous M. incognita genome of about 50 Mb, which is consistent with most allelic regions having been assembled separately (discussed in section 2.2).
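The coverage argument can be checked with a back-of-the-envelope calculation. The sketch below uses only the read count, mean read length and approximate haploid genome size quoted in the text; the helper name `depth` is hypothetical, and treating two separately assembled haplotypes as a target of twice the haploid size is a simplifying assumption.

```python
READS = 1_000_873        # number of reads (section 2.1)
MEAN_READ_LENGTH = 638   # mean trimmed read length in bases (section 2.1)
HAPLOID_GENOME_MB = 50   # approximate haploid genome size used in the text

total_mb = READS * MEAN_READ_LENGTH / 1e6  # total trimmed sequence in Mb


def depth(total_sequenced_mb: float, target_mb: float) -> float:
    """Average fold coverage if all sequenced bases were spread evenly over the target."""
    return total_sequenced_mb / target_mb


# Assumption: alleles collapsed into one 50 Mb haploid sequence
single_copy = depth(total_mb, HAPLOID_GENOME_MB)
# Assumption: both haplotypes assembled separately (about 2 x 50 Mb)
two_haplotypes = depth(total_mb, 2 * HAPLOID_GENOME_MB)

print(f"total sequence: {total_mb:.1f} Mb")
print(f"depth over one haplotype: {single_copy:.1f}x")
print(f"depth over two haplotypes: {two_haplotypes:.1f}x")
```

Both values are upper bounds on the depth seen in the assembly, since they assume that every trimmed base is placed. The reported figure of about 5x lies much closer to the two-haplotype value than to the single-copy one.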

Figure S2 | Sequence coverage across the assembly.

The x axis indicates the redundancy of reads at a given position (0 to 20 in the plot). The y axis indicates the number of bases at that level of redundancy (0 to 6e+06 in the plot).


2.2 Detection of scaffold pairs and triplets

We found 1,473 clusters of M. incognita homologous genes, containing 6,068 predicted genes and spanning about 54 Mb (Supplementary Methods, section 8.3). We then aligned all paired supercontigs at the nucleotide level. The observed similarities and substitution rates (Fig. S3 and S4) confirmed the extensive shared homolog content between 648 supercontigs. This procedure could not assign any allelic relationship between small supercontigs, as they do not contain sufficient information to construct clusters. We estimate that most of the genome is present as two highly diverged haplotypes, as all large supercontigs but two were included in the list of 648 cases. The two exceptions were checked and found to be gene-poor and rich in repetitive elements, which could be explained by the difficulty of aligning single-copy sequences in such regions, or of assembling them correctly. None of the remaining supercontigs could be assigned a clear allelic partner, essentially because of their small size.

We found that about 3.35 Mb of the assembly corresponded to a third copy aligning with two previously identified allelic supercontigs. By analysing the unassembled reads (Supplementary Methods, section 8.3), we estimated that a further 2.4 Mb of haploid genome sequence may represent a third copy of some sequence segments, in addition to the 3.35 Mb identified in the supercontigs. The total size of the genome that displays a third allelic version may therefore be estimated as 3.35 + 2.4 = 5.75 Mb, or about 11.5% of the genome, if we take the haploid genome to be 50 Mb¹².
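The third-copy estimate is a sum followed by a ratio. A minimal worked version, using only the three figures quoted in the paragraph above (the variable names are illustrative):

```python
THIRD_COPY_IN_SUPERCONTIGS_MB = 3.35        # third copy found in the assembly
THIRD_COPY_FROM_UNASSEMBLED_READS_MB = 2.4  # estimate from unassembled reads
HAPLOID_GENOME_MB = 50.0                    # haploid genome size assumed in the text

third_copy_total_mb = (
    THIRD_COPY_IN_SUPERCONTIGS_MB + THIRD_COPY_FROM_UNASSEMBLED_READS_MB
)
third_copy_fraction = third_copy_total_mb / HAPLOID_GENOME_MB

print(f"{third_copy_total_mb:.2f} Mb, {third_copy_fraction:.1%} of the haploid genome")
```

The percentage scales inversely with the assumed haploid size, so a different estimate of the haploid genome would change the 11.5% figure proportionally.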

The pattern of evolutionary relationships between homologous copies in a genome will reflect the accumulated signature of the history of reproductive mode. As M. incognita is a mitotic parthenogen, the genome is currently evolving without recombination, and we would expect former alleles to be accumulating mutations and drifting apart: the Meselson effect¹³,¹⁴. The expected divergence of former alleles will be conditioned by the age of the asexual lineage, and the mode of acquisition of asexuality.

The mode of acquisition of asexuality in M. incognita is unknown. Two main models are possible. One is an endogenous acquisition of mitotic parthenogenesis: in this model, homologous chromosomes initially contain a frozen snapshot of two of the alleles present in the originating population, and these alleles then drift apart. The alternative model is the generation of asexual lineages by hybridisation between parents from different species: in this model, the initial divergence between homologs is expected to be higher, and to include differential segmental rearrangements, duplications and deletions. The observed regions corresponding to a third partial copy of the genome could result from (1) differential segmental duplication patterns in the hybridising parents, (2) segmental duplications involving only one chromosome homolog following the acquisition of asexuality and/or (3) partial (hypo-)triploidy. M. incognita is presumed to have derived from an ancestor with haploid chromosome number n=18; different isolates have chromosome numbers ranging from 32 to 46, and thus possibly represent species varying from simple diploids to hypotriploids⁹. In the absence of external reference points for the date of origin of M. incognita and for the genomic mutation rate, it is not possible to distinguish between the two models. Acquisition of genomic data from closely related Meloidogyne species could both inform models of divergence within sexual species and possibly identify them as parents of a hybrid parthenogen. An ability to tolerate partial genome duplication (to form hypotriploids), thus allowing exploration of adaptive space by the duplicated genes, might be one mechanism underpinning adaptation to multiple hosts.

Figure S3 | Nucleotide similarity between allelic super-contigs.

The x axis represents the % identity (accounting for both substitutions and small insertions-deletions) between pairs of allelic super-contigs (80 to 100 in the plot). The y axis represents the cumulative size of the regions in each identity category (0 to 3e+06 in the plot).


Figure S4 | Substitution rates between allelic super-contigs.

The x axis indicates the substitution percentage between two allelic super-contigs, not accounting for insertions-deletions (0 to 20 in the plot). The y axis indicates the number of nucleotides in the assembly for each substitution value (0 to 4e+06 in the plot).
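Figures S3 and S4 rely on two related measures: an identity that counts both substitutions and small insertions-deletions, and a substitution percentage that ignores insertions-deletions. The sketch below shows one plausible way to compute both from a gapped pairwise alignment. The exact formulas used by the authors are not given in this section, so the denominators chosen here (all aligned columns for identity, ungapped columns only for substitutions) and the function name `alignment_stats` are assumptions.

```python
def alignment_stats(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (% identity counting indels, % substitutions ignoring indels)
    for the two rows of a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = gapped = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue              # column empty in both rows: ignore
        if a == gap or b == gap:
            gapped += 1           # insertion-deletion column
        elif a == b:
            matches += 1
        else:
            mismatches += 1       # substitution column
    ungapped = matches + mismatches
    if ungapped == 0:
        raise ValueError("alignment has no ungapped columns")
    identity = 100.0 * matches / (ungapped + gapped)
    substitution = 100.0 * mismatches / ungapped
    return identity, substitution


# Toy alignment (hypothetical): 8 matches, 1 substitution, 1 gap column
identity, substitution = alignment_stats("ACGTACGTAC", "ACGTTCG-AC")
print(f"identity {identity:.1f}%, substitutions {substitution:.1f}%")
```

Keeping the two measures separate makes it possible to tell whether divergence between allelic regions is driven by point substitutions or by small indels.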


3 Repetitive and non protein-coding sequences

3.1 Repeats and Transposable Elements

The BLASTER all-by-all comparison of the M. incognita genome (first step of the de novo pipeline; Supplementary Methods, section 8.4) indicated that repeats cover 19% of the genome. This is clearly an underestimate, owing to the high stringency of this search. At the end of the annotation pipeline, 36% of the M. incognita genome matched consensus sequences for repeats (Table S2).

In total, 4,041 different repeat families were detected, of which 3,066 had no visible TE features and 135 had contradictory characteristics. Note that some of the families in these two groups may correspond to satellite repeats.

Only 690 families had obvious TE features: 210 LTR retroelements, 29 LINE-like elements, 13 SINE-like elements, 430 TIR transposons and 8 Helitrons. Table S2 summarizes the statistics for these categories, together with those for SSRs and for repeats with no TE features or with contradictory characteristics.

Table S2 | Summary statistics for repeats in M. incognita genome

Repeat type              TE class   Number of families   Number of copies   Coverage (bp)   Coverage (% of genome)
LTR retroelement         Class I                   210              2,625       2,238,615                     2.37
LINE                     Class I                    29                381         268,357                     0.31
SINE                     Class I                    13                148          75,296                     0.09
TIR transposon           Class II                  430              9,725       2,897,931                     3.37
Helitron                 Class II                    8                108         201,474                     0.23
SSR                      -                         150              2,100         541,345                     0.63
no TE features           -                       3,066             82,598      22,556,643                    26.21
contradictory features   -                         135              2,263       3,019,110                     3.51
Total                                            4,041             99,843      31,601,493                    36.72
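As a quick consistency check, the family counts and the percentage column of Table S2 can be tallied as in the sketch below. The values are transcribed from the table; the names `TABLE_S2`, `TE_TYPES` and `tally` are illustrative and are not part of the original annotation pipeline.

```python
from decimal import Decimal

# Rows transcribed from Table S2: (repeat type, number of families, coverage in % of genome).
TABLE_S2 = [
    ("LTR retroelement", 210, "2.37"),
    ("LINE", 29, "0.31"),
    ("SINE", 13, "0.09"),
    ("TIR transposon", 430, "3.37"),
    ("Helitron", 8, "0.23"),
    ("SSR", 150, "0.63"),
    ("no TE features", 3066, "26.21"),
    ("contradictory features", 135, "3.51"),
]

# Categories described in the text as having obvious TE features.
TE_TYPES = {"LTR retroelement", "LINE", "SINE", "TIR transposon", "Helitron"}


def tally(rows):
    """Return (families with TE features, all families, summed coverage in %)."""
    te_families = sum(n for name, n, _ in rows if name in TE_TYPES)
    all_families = sum(n for _, n, _ in rows)
    # Decimal avoids floating-point noise when adding two-decimal percentages.
    coverage_pct = sum(Decimal(pct) for _, _, pct in rows)
    return te_families, all_families, coverage_pct


te_families, all_families, coverage_pct = tally(TABLE_S2)
print(te_families, all_families, coverage_pct)
```

The three values agree with the 690 TE families quoted in the text and with the "Total" row for families and percentage coverage.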

The highest number of copies for a repeat family was 583. The consensus sequence of this family is 628 bp long, with no visible TE feature. Each family had on average 24 copies, with a median of 12, indicating a right-skewed distribution of copy number among repeat families.

The highest number of complete copies, 43, was found for another family, with a 944 bp consensus sequence and no obvious TE feature. Copies are considered complete when their length is at least 95% of the consensus length. Figures S5 and S6 show the copy numbers for different types of repeat.
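The completeness criterion above (copy length at least 95% of the consensus length) can be expressed as a short sketch. The function names and the example copy lengths are hypothetical; only the 95% threshold and the 944 bp consensus length come from the text.

```python
def min_complete_length(consensus_len, percent=95):
    """Smallest copy length (bp) that reaches `percent` of the consensus length."""
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-consensus_len * percent // 100)


def is_complete(copy_len, consensus_len, percent=95):
    """True when the copy spans at least `percent` of the consensus length."""
    return copy_len * 100 >= consensus_len * percent


# Hypothetical copy lengths for a family with a 944 bp consensus.
copy_lengths = [944, 900, 897, 896, 450]
complete = [length for length in copy_lengths if is_complete(length, 944)]
print(min_complete_length(944), complete)
```

Under this reading, a copy of the 944 bp family needs to be at least 897 bp long to count as complete.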

Figure S5 | Distribution of copy number for different super-families of TEs.

In terms of dynamics, 111 families have a minimum copy-to-family-consensus identity greater than 95%, indicating recently formed families and possible current activity. Altogether, they represent 334 copies and cover 0.4% of the genome (Fig. S6).
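A minimal sketch of the "recent family" filter described above, assuming that each family is represented by the list of percentage identities between its copies and its consensus. The family names and identity values are invented for illustration.

```python
def recent_families(identities_by_family, threshold=95.0):
    """Names of families whose lowest copy-to-consensus identity exceeds `threshold`."""
    return sorted(
        family
        for family, identities in identities_by_family.items()
        if identities and min(identities) > threshold
    )


# Hypothetical identities (% identity of each copy to its family consensus).
example = {
    "family_A": [99.1, 97.4, 96.0],  # every copy above 95%: counted as recent
    "family_B": [98.5, 94.9],        # one copy at 94.9%: not counted
    "family_C": [95.0, 99.0],        # minimum equals 95%, not strictly greater
}
print(recent_families(example))
```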

Figure S6 | Distribution of the mean identity between the copies and the consensus for different super-families of TEs.

3.2 Noncoding RNAs (ncRNAs)

3.2.1 Ribosomal RNA

In the M. incognita assembly, five 18S-5.8S-26S rDNA clusters of 5.9 kb were identified (Supplementary Methods, section 8.5). These five complete copies are located on five different scaffolds and are not repeated in tandem. No external transcribed spacer was identified, which explains the smaller size compared to the 7.2 kb repeat unit in C. elegans. One 28S-18S-5.8S rDNA cluster of 7.5 kb was also identified in the assembly. Almost complete sequences of the 5.8S, 18S and 28S ribosomal genes were found in 15, 12 and 7 scaffolds, respectively. In addition to the genes located in the six rDNA clusters, 23 copies of the 5.8S, 13 copies of the 18S and 1 copy of the 28S ribosomal genes were identified, each located in one of the 18 scaffolds containing at least one complete ribosomal gene sequence.

3.2.2 tRNA

A total of 611 tRNA genes were identified by tRNAscan-SE (Supplementary Methods, section 8.5). This set was composed of 467 tRNA genes representing the whole set of amino acids; 120 tRNA pseudogenes; 24 putative tRNAs with undetermined or unknown isotypes; 3 selenocysteine tRNA genes and 1 possible suppressor tRNA. We note a strong bias in the representation of tRNA-Threonine genes, with 203 occurrences in the genome (Table S3). This number represents 43% of all tRNAs, about six-fold the number of tRNA-Threonine genes identified in C. elegans (34) and five-fold the number in C. briggsae (42)¹⁵. Note that ACC, the codon with by far the most tRNA-Threonine genes (Table S3), is not the most commonly used Thr codon in M. incognita transcripts, accounting for 11.1% of 19,046 occurrences in translated ESTs¹⁶.
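The proportions quoted above can be reproduced from the counts in Table S3 and in the text. The sketch below only restates that arithmetic; the variable names are illustrative.

```python
# tRNA-Thr gene counts per codon, transcribed from Table S3 (pseudogenes excluded).
THR_GENES = {"ACU": 8, "ACC": 187, "ACA": 6, "ACG": 2}

TOTAL_TRNA_GENES = 467  # tRNA genes reported in the text
C_ELEGANS_THR = 34      # tRNA-Thr genes in C. elegans, as quoted in the text
C_BRIGGSAE_THR = 42     # tRNA-Thr genes in C. briggsae, as quoted in the text

thr_total = sum(THR_GENES.values())
thr_share = 100 * thr_total / TOTAL_TRNA_GENES
fold_vs_elegans = thr_total / C_ELEGANS_THR
fold_vs_briggsae = thr_total / C_BRIGGSAE_THR

print(f"tRNA-Thr genes: {thr_total}")
print(f"share of tRNA genes: {thr_share:.1f}%")
print(f"fold vs C. elegans: {fold_vs_elegans:.1f}, vs C. briggsae: {fold_vs_briggsae:.1f}")
```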

Table S3 | Numbers of predicted tRNA genes.

UUU Phe 0         UCU Ser 0         UAU Tyr 33(5)     UGU Cys 2
UUC Phe 4(1)      UCC Ser 0         UAC Tyr 5         UGC Cys 2
UUA Leu 1(1)      UCA Ser 4(3)      UAA Stop 0(3)     UGA Stop Sel[3]
UUG Leu 3(1)      UCG Ser 3(1)      UAG Stop 0(1)     UGG Trp 2
CUU Leu 8         CCU Pro 5(1)      CAU His 3(6)      CGU Arg 46(2)
CUC Leu 0(1)      CCC Pro 0(1)      CAC His 7(7)      CGC Arg 0(2)
CUA Leu 1(1)      CCA Pro 27(14)    CAA Gln 4(1)      CGA Arg 2
CUG Leu 0         CCG Pro 9(1)      CAG Gln 1         CGG Arg 1(1)
AUU Ile 8(2)      ACU Thr 8         AAU Asn 1         AGU Ser 0
AUC Ile 2(2)      ACC Thr 187(24)   AAC Asn 5         AGC Ser 8
AUA Ile 2         ACA Thr 6(12)     AAA Lys 5(3)      AGA Arg 1
AUG Met 6         ACG Thr 2(1)      AAG Lys 2         AGG Arg 2
GUU Val 7         GCU Ala 3(1)      GAU Asp 0         GGU Gly 0
GUC Val 0         GCC Ala 1         GAC Asp 8         GGC Gly 7
GUA Val 1         GCA Ala 1(4)      GAA Glu 6(2)      GGA Gly 6(1)
GUG Val 2(1)      GCG Ala 2         GAG Glu 1         GGG Gly 0(1)

Each entry gives the codon, the corresponding amino acid and the number of predicted tRNA genes. The number of occurrences of tRNA pseudogenes is shown in parentheses. Sel[3] denotes the 3 selenocysteine tRNA genes.

3.2.3 miRNA

A total of 174 putative microRNA precursors were identified by the mirfold-based pipeline, which identifies occurrences in the M. incognita genome of microRNA precursors registered in the mirBase release¹⁷ (Supplementary Methods, section 8.5). To retrieve the phylogenetic profile of these putative miRNAs, the mature sequence of each gene was compared to mirBase using NCBI-BLASTN, allowing up to 3 mismatches. Twelve miRNAs were found to be specific to the phylum Nematoda (mir-46, mir-58, mir-262, mir-258; two copies each of mir-50, mir-79, mir-354 and mir-261), and 61 of the candidates were previously registered in mirBase only in the Viridiplantae.
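The "up to 3 mismatches" criterion can be illustrated with a simplified, ungapped comparison. This sketch is not the NCBI-BLASTN procedure used in the analysis: it only compares sequences of equal length position by position, and the sequences and names below are toy examples, not real mirBase entries.

```python
def count_mismatches(a, b):
    """Number of differing positions between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def close_matches(candidate, known, max_mismatches=3):
    """Names of known sequences within `max_mismatches` of the candidate."""
    return sorted(
        name
        for name, seq in known.items()
        if len(seq) == len(candidate)
        and count_mismatches(candidate, seq) <= max_mismatches
    )


# Toy 22-nt sequences, invented for illustration only.
known_mature = {
    "toy-1": "ACGUACGUACGUACGUACGUAC",
    "toy-2": "UUUUUUUUUUUUUUUUUUUUUU",
}
candidate = "ACGUACGUACGAACGUACGUAC"  # differs from toy-1 at one position
print(close_matches(candidate, known_mature))
```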

3.3 Spliced Leader (SL)

Trans-splicing is an important process in gene expression that stabilises mRNA by providing a 5’-cap structure, refines the 5’ untranslated region of pre-mRNA and enhances translation. Trans-splicing is the spliceosomal transfer of a short mRNA molecule, the spliced leader (SL), to unpaired splice-acceptor sites upstream of the ATG initiation site of pre-mRNAs. SL trans-splicing can also have a role in resolving polycistronic transcripts into individual capped mRNAs. SL genes contain a conserved 20-24-nt SL and a more diverged intron. Trans-splicing has been demonstrated in a variety of eukaryotes, including nematodes, platyhelminths, tunicates, cnidarians and euglenozoa¹⁸.

In rhabditine nematodes, SL2-type spliced leaders resolve polycistronic pre-mRNA. We did not identify SL2 in M. incognita despite the presence of operons in the genome, strengthening the finding that SL2 is an evolutionary invention of rhabditines¹⁹. However, the canonical 22-nt SL1 has been identified in all nematodes surveyed. In C. elegans about half of the genes are trans-spliced by SL1, and 110 SL1 genes are organized in tandem repeats associated with the 5S RNA²⁰. In the genome of M. incognita we identified 283 Mi-SL1 genes distributed among 46 contigs (Supplementary Methods, section 8.6). After sequence alignment we identified two major groups of similar SL1s that corresponded to two variant forms of SL1 with no homology in the intron (Fig. S7a). Seventeen copies of the variant Mi-SL1a are distributed on 6 contigs. Mi-SL1b was the most abundant variant in the genome, with 258 copies associated with the tandem repeat satellite DNA MelSAT B (GenBank Accession n° L07110). The other SL1 genes, disseminated on contigs independently of 5S RNA, showed strong conservation of the 22-nt exon sequence NDTTKRATTACCCAAGTWTRAG and highly diverged intron sequences. Only Mi-SL1a and Mi-SL1b RNA have the RAU₄₋₆GR Sm-binding consensus sequence required for spliceosome activity. Those two variants had a predicted secondary structure similar to the structure of nematode and trypanosome SL1²¹, with the splice donor site adjacent to the turn of the most 5' loop and the Sm-binding sequence in a linear region of the molecule (Fig. S7b). To identify which SL1 variants are predominantly used in M. incognita, we screened EST sequences available in public databases (excluding the libraries where SL1 was used for cDNA amplification). Among 12,434 ESTs screened, we identified 140 (1.12% of ESTs) carrying the Mi-SL1a variant GGTTTAATTACCCAAGTTTGAG and 1 (0.008% of ESTs) carrying the Mi-SL1b variant GGTTTAATTACCCAAGTTTAAG. In addition, two other SL variants, GGTTTAATTATCCAAGTTTGAG (60 copies; 0.48% of ESTs) and AGTTTAATTACCCAAGTTTGAG (40 copies; 0.32% of ESTs), were identified. Thus, Mi-SL1a was the most frequently observed SL1 variant, representing 55.8% of all SL variants identified in M. incognita.
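To make the degenerate exon consensus concrete, the sketch below expands the IUPAC codes of NDTTKRATTACCCAAGTWTRAG into a regular expression and tests the SL sequences quoted above against it. It also shows one possible way of screening an EST for an SL variant; the actual screening procedure is not described in this section, so the exact-match rule, the 40-base window and the function names are assumptions made for illustration.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]", "S": "[CG]", "W": "[AT]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

SL1_CONSENSUS = "NDTTKRATTACCCAAGTWTRAG"  # 22-nt exon consensus quoted in the text
SL_VARIANTS = {
    "Mi-SL1a": "GGTTTAATTACCCAAGTTTGAG",
    "Mi-SL1b": "GGTTTAATTACCCAAGTTTAAG",
}


def iupac_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def classify_est(est, variants=SL_VARIANTS, window=40):
    """Name of the first SL variant found exactly within the first `window` bases.

    Assumption: the SL, when present, lies at the 5' end of the EST.
    """
    head = est.upper()[:window]
    for name, sl in variants.items():
        if sl in head:
            return name
    return None


pattern = iupac_to_regex(SL1_CONSENSUS)
for name, seq in SL_VARIANTS.items():
    print(name, bool(pattern.fullmatch(seq)))
```

Both Mi-SL1a and Mi-SL1b fit the consensus, whereas the EST variant GGTTTAATTATCCAAGTTTGAG does not, because it carries a T where the consensus requires a C at position 11.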

a

Ce SL1   GGTTTAATTACCCAAGTTTGAGGTA--AACATTGAAACTGACCCAAAGAAATTTGGCGTTAGCTATAAATTTTGGAA-CGTCTCCTCTC---GGGGAGACAAAAATACTAA
Mi-SL1b  GGTTTAATTACCCAAGTTTAAGGTATGTAAATCATAACT------------------ACTTGGGAAAAATTTTGGAATTGTATT---TCGAAAGAAATACTTAAA-ATTAA
Mi-SL1a  GGTTTAATTACCCAAGTTTGAGGTA--CTAATTGTAACCTGCCCTTTTAAAAGTGGCAACTTGGACAAATTTTGGAATTGCTTTGCCTTTGTGGCGAGGCTTAAA-TTTGG

Figure S7 | Comparison of the primary and secondary structure of SL1 from M. incognita and C. elegans.

a, Alignment of SL1 sequences from C. elegans and M. incognita. The two most abundant SL1 variants in M. incognita, Mi-SL1a and Mi-SL1b, are presented. b, Secondary structure of SL1 from C. elegans and Mi-SL1a from M. incognita. In both structures the splice donor site is adjacent to the turn of the most 5' loop and the Sm-binding sequence is in a flexible region of the RNA.

b

[Panel b: C. elegans SL1 secondary structure and M. incognita SL1 secondary structure, with the splice donor site and the Sm site labelled.]

4 Operonic structures

4.1 Background

The presence of operons, a feature usually associated with prokaryote genomes²², is one surprising feature of nematode genomes. In nematodes, operons give rise to polycistronic transcripts that are resolved into mature mRNAs by trans-splicing to a small, non-coding exon called the spliced leader²³. In the model C. elegans, downstream genes in operons are most frequently trans-spliced to one of a small family of spliced leader (SL) exons, the SL2-like SLs. This feature makes it possible to define operons by identifying the genes that yield mRNAs with SL2-like spliced leaders at their 5’ ends²⁴. However, in other, non-rhabditid nematode species, the SL2-like family is absent, and operonic genes are trans-spliced by SL1 and SL1-like SL exons¹⁹. SL1 and SL1-like SLs are also utilised in trans-splicing to non-operonic genes. Meloidogyne species have a family of SL1-like spliced leader genes²⁵.

In the absence of an operon-specific SL mark on operonic mRNAs, operons can be identified by the close apposition of the constituent genes. In C. elegans the spacing between upstream and downstream genes is usually less than 500 bases. However, in other species analysed, the spacing can be much larger, up to 1 kb²³. We identified putative operons in the M. incognita genome by searching for genes on the same strand separated by less than 1 kb, and compared the operons thus defined with the operon sets from the other sequenced nematode genomes.

In C. elegans, C. briggsae and B. malayi approximately 20% of the protein-coding genes are organised in ~800 operons of 2 to 8 genes²⁴,²⁶⁻²⁸. Over 95% of the operons in C. elegans are conserved in C. briggsae²⁶, but only 20% of B. malayi operons have orthologous structures in C. elegans²⁸. Thus operons are a dynamic feature of nematode genome organisation¹⁹, and investigation of the operon content of M. incognita will assist in parameterising models of how operons arise, what maintains genes in operons, and how operons are broken up.

4.2 Supplementary results

Identification of putative operons in M. incognita

Operonic genes in C. elegans have short intergenic spacings, and form a distinct class of gene pairs. In contrast, the frequency distribution of intergenic spacing of M. incognita genes shows a monotonic (log-linear) decline with increasing spacing. There was no obvious step change in gene spacing frequency at shorter distances (less than 1,000 bases), and thus these data did not offer an independent method of identification of putatively operonic genes.

Another marker of operonic genes in C. elegans is the presence of a trans-splice acceptor site close upstream of the putative start site of the open reading frame. The 300 bases upstream of each putatively operonic gene (and of an equal number of non-operonic genes) were scanned for matches to the nematode trans-splice acceptor site consensus (TTT[T|C]AG). The distribution of trans-splice acceptor sites within the 300 bases upstream did not differ substantially between gene classes (putatively operonic, internal to putative operons and non-operonic; Fig. S8). Notably, however, the proportion of genes with a trans-splice acceptor site within these 300 bases was more than twice as high among putatively operonic genes as among non-operonic genes (~50% vs. ~21%, respectively).
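The scan for the acceptor consensus TTT[T|C]AG can be sketched with a regular expression. The example assumes that the upstream sequence is already given in the orientation of the gene (5' to 3', ending just before the start site); the function names and the test sequence are illustrative, and this is not the script used for Fig. S8.

```python
import re

# Lookahead so that overlapping occurrences of the motif are all reported.
ACCEPTOR = re.compile(r"(?=TTT[TC]AG)")


def acceptor_positions(upstream):
    """0-based start positions of TTT[TC]AG matches in an upstream sequence."""
    return [m.start() for m in ACCEPTOR.finditer(upstream.upper())]


def has_acceptor(upstream, window=300):
    """True when the last `window` bases before the start site contain the motif."""
    return bool(acceptor_positions(upstream[-window:]))


# Illustrative sequence containing TTTTAG at position 3 and TTTCAG at position 12.
toy_upstream = "AAATTTTAGCCCTTTCAGAT"
print(acceptor_positions(toy_upstream))
```

With per-gene upstream sequences in hand, the fraction of genes for which `has_acceptor` is true could then be compared between putatively operonic and non-operonic genes.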

Figure S8 | Distribution of trans-splice sites within 300 bases upstream of M. incognita genes.

Putative operons were therefore predicted based on a minimum spacing of 25 base pairs and a maximum of 1,000 base pairs (Supplementary Methods, section 8.7). A total of 1,585 putative operons, containing 3,966 genes (20% of the total number of predicted genes), were identified (Fig. S9). As would be expected from the pattern of gene spacing, short (two-gene) operons are the most common, and there is a log-linear decline in the number of putative operons with increasing numbers of members. This is similar to the observations in C. elegans and B. malayi. The longest operons (MIOP01539 and MIOP00618) contain ten genes each (Table S4, separate file), are not diverged ancient allelic copies and have no clear functional linkage.
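The spacing rule described above can be sketched as follows: consecutive genes on the same contig and strand are chained into a putative operon when the gap between them is between 25 and 1,000 bp. This is a minimal sketch under stated assumptions, not the procedure of Supplementary Methods section 8.7: the `Gene` record, the inclusive treatment of both bounds and the toy coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    contig: str
    strand: str  # "+" or "-"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def predict_operons(genes, min_gap=25, max_gap=1000):
    """Group neighbouring same-strand genes whose spacing lies in [min_gap, max_gap]."""
    operons, current = [], []
    for gene in sorted(genes, key=lambda g: (g.contig, g.start)):
        if current:
            prev = current[-1]
            gap = gene.start - prev.end - 1  # bases between the two genes
            linked = (
                gene.contig == prev.contig
                and gene.strand == prev.strand
                and min_gap <= gap <= max_gap
            )
            if not linked:
                if len(current) >= 2:  # an operon needs at least two genes
                    operons.append(current)
                current = []
        current.append(gene)
    if len(current) >= 2:
        operons.append(current)
    return operons


# Toy annotation with invented coordinates.
toy_genes = [
    Gene("A", "contig1", "+", 100, 500),
    Gene("B", "contig1", "+", 700, 1200),   # gap 199 bp from A: linked
    Gene("C", "contig1", "+", 1210, 1500),  # gap 9 bp from B: below the minimum
    Gene("D", "contig1", "-", 1600, 2000),  # strand differs from C
    Gene("E", "contig1", "-", 2500, 3000),  # gap 499 bp from D: linked
    Gene("F", "contig2", "-", 3100, 3500),  # different contig
]
operon_names = [[g.name for g in operon] for operon in predict_operons(toy_genes)]
print(operon_names)
```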

Figure S9 | Operon sizes in sequenced nematode genomes.

Comparisons with C. elegans and B. malayi

C. elegans is predicted to contain 1,118 operons, containing 2,867 genes (14% of the total number of genes). B. malayi is predicted to have 926 operons, containing 2,012 genes (18% of the total). The apparent excess of putatively operonic genes in M. incognita (20% of the predicted genes) is likely to derive in part from the many gene pairs with short spacing (2,617, of which 566 were contained within operons). Counting these pairs as single genes would reduce the proportion to ~17% (still substantially higher than in C. elegans). Thus the two pathogenic nematodes appear to have a greater proportion of their predicted transcriptome in putative operonic structures. However, the difference in the proportion of genes in operons between C. elegans and the two pathogenic nematodes should be interpreted with caution at this point. Prediction of operons in M. incognita and B. malayi was based only on gene spacing, whereas in C. elegans additional criteria, such as the presence of a trans-splice acceptor, were required. A proportion of false positives may therefore alternatively explain this difference. It is notable that B. malayi has a relative lack of longer operons. This is most likely a result of the draft nature of the genome sequence.

Over 14,500 M. incognita genes are placed in OrthoMCL groups with B. malayi genes, and of these 3,083 (21%) are in operons. A total of 8,098 M. incognita genes have C. elegans orthologs, 1,879 (23%) of which are in operons. Thus conserved genes are slightly more likely to be operonic than those with no orthologs in these nematodes. This finding is congruent with the observation that highly conserved genes are more likely to be in C. elegans operons. However, these shared genes are generally not in orthologous operons. Only three two-gene operons were conserved between M. incognita and B. malayi, while nine two-gene operons were conserved between C. elegans and M. incognita. One operon was found to be conserved between all three species: MIOP00390, containing the same ortholog pair as do BMOP00276 and CEOP3228 (Table S5).

Some M. incognita operons were similar to C. elegans operons but either had additional predicted members (e.g. MIOP00669 has three members, but the C. elegans orthologs of only the first two genes are in an operon, CEOP3456), or fewer members than the matching C. elegans operon (e.g. MIOP00687 has two members, while their C. elegans orthologs are in operon CEOP3740, which has six members). Some operons contained several genes with B. malayi or C. elegans orthologs but no shared operonic structure. For example, MIOP00862 contains eight genes, six of which have B. malayi orthologs. However, only one of the B. malayi orthologs is in an operon.

Operons defined in release 1 of the Meloidogyne incognita genome sequence.

Operon name Gene 1 Gene 2 Gene 3 Gene 4 Gene 5 Gene 6 Gene 7 Gene 8 Gene 9 Gene 10 Size
MIOP00618 Minc07758 Minc07759 Minc07760 Minc07761 Minc07762 Minc07763 Minc07764 Minc07765 Minc07766 Minc07767 10
MIOP01539 Minc15145 Minc15146 Minc15147 Minc15148 Minc15149 Minc15150 Minc15151 Minc15152 Minc15153 Minc15154 10
MIOP00240 Minc05258 Minc05259 Minc05260 Minc05261 Minc05262 Minc05263 Minc05264 Minc05265 Minc05266 9
MIOP01370 Minc03563 Minc03564 Minc03565 Minc03566 Minc03567 Minc03568 Minc03569 Minc03570 Minc03571 9
MIOP00074 Minc15987 Minc15988 Minc15989 Minc15990 Minc15991 Minc15992 Minc15993 Minc15994 8
MIOP00149 Minc04916 Minc04917 Minc04918 Minc04919 Minc04920 Minc04921 Minc04922 Minc04923 8
MIOP00366 Minc05889 Minc05890 Minc05891 Minc05892 Minc05893 Minc05894 Minc05895 Minc05896 8
MIOP00862 Minc09981 Minc09982 Minc09983 Minc09984 Minc09985 Minc09986 Minc09987 Minc09988 8
MIOP00884 Minc10224 Minc10225 Minc10226 Minc10227 Minc10228 Minc10229 Minc10230 Minc10231 8
MIOP01521 Minc15015 Minc15016 Minc15017 Minc15018 Minc15019 Minc15020 Minc15021 Minc15022 8
MIOP00168 Minc04970 Minc04971 Minc04972 Minc04973 Minc04974 Minc04975 Minc04976 7
MIOP00545 Minc07132 Minc07133 Minc07134 Minc07135 Minc07136 Minc07137 Minc07138 7
MIOP00694 Minc08424 Minc08425 Minc08426 Minc08427 Minc08428 Minc08429 Minc08430 7
MIOP00881 Minc10192 Minc10193 Minc10194 Minc10195 Minc10196 Minc10197 Minc10198 7
MIOP00912 Minc10412 Minc10413 Minc10414 Minc10415 Minc10416 Minc10417 Minc10418 7
MIOP01374 Minc03602 Minc03603 Minc03604 Minc03605 Minc03606 Minc03607 Minc03608 7
MIOP01385 Minc03636 Minc03637 Minc03638 Minc03639 Minc03640 Minc03641 Minc03642 7
MIOP00029 Minc04410 Minc04411 Minc04412 Minc04413 Minc04414 Minc04415 6
MIOP00421 Minc06227 Minc06228 Minc06229 Minc06230 Minc06231 Minc06232 6
MIOP00582 Minc07499 Minc07500 Minc07501 Minc07502 Minc07503 Minc07504 6
MIOP00784 Minc09318 Minc09319 Minc09320 Minc09321 Minc09322 Minc09323 6
MIOP00863 Minc10000 Minc10001 Minc10002 Minc10003 Minc10004 Minc10005 6
MIOP00871 Minc10075 Minc10076 Minc10077 Minc10078 Minc10079 Minc10080 6
MIOP00876 Minc10153 Minc10154 Minc10155 Minc10156 Minc10157 Minc10158 6
MIOP01100 Minc11995 Minc11996 Minc11997 Minc11998 Minc11999 Minc12000 6
MIOP01158 Minc12458 Minc12459 Minc12460 Minc12461 Minc12462 Minc12463 6
MIOP01313 Minc00567 Minc00568 Minc00569 Minc00570 Minc00571 Minc00572 6
MIOP01356 Minc03522 Minc03523 Minc03524 Minc03525 Minc03526 Minc03527 6
MIOP01419 Minc14318 Minc14319 Minc14320 Minc14321 Minc14322 Minc14323 6
MIOP01452 Minc03823 Minc03824 Minc03825 Minc03826 Minc03827 Minc03828 6
MIOP00033 Minc15692 Minc15693 Minc15694 Minc15695 Minc15696 5
MIOP00035 Minc04428 Minc04429 Minc04430 Minc04431 Minc04432 5
MIOP00054 Minc15850 Minc15851 Minc15852 Minc15853 Minc15854 5
MIOP00076 Minc04606 Minc04607 Minc04608 Minc04609 Minc04610 5
MIOP00387 Minc17852 Minc17853 Minc17854 Minc17855 Minc17856 5
MIOP00394 Minc01112 Minc01113 Minc01114 Minc01115 Minc01116 5
MIOP00458 Minc01206 Minc01207 Minc01209 Minc01210 Minc01211 5
MIOP00551 Minc01360 Minc01361 Minc01362 Minc01363 Minc01364 5
MIOP00552 Minc01374 Minc01375 Minc01376 Minc01377 Minc01378 5
MIOP00621 Minc07784 Minc07785 Minc07786 Minc07787 Minc07788 5
MIOP00688 Minc01668 Minc01669 Minc01670 Minc01671 Minc01672 5
MIOP00708 Minc08607 Minc08608 Minc08609 Minc08610 Minc08611 5
MIOP00745 Minc08935 Minc08936 Minc08937 Minc08938 Minc08939 5
MIOP00763 Minc01855 Minc01856 Minc01857 Minc01858 Minc01859 5
MIOP00781 Minc09271 Minc09272 Minc09273 Minc09274 Minc09275 5
MIOP00782 Minc09277 Minc09278 Minc09279 Minc09280 Minc09281 5
MIOP00785 Minc09325 Minc09326 Minc09327 Minc09328 Minc09329 5
MIOP00795 Minc01943 Minc01944 Minc01945 Minc01946 Minc01947 5
MIOP00798 Minc09406 Minc09407 Minc09408 Minc09409 Minc09410 5
MIOP00901 Minc02167 Minc02168 Minc02169 Minc02170 Minc02171 5
MIOP00930 Minc02215 Minc02216 Minc02217 Minc02218 Minc02219 5
MIOP00939 Minc10559 Minc10560 Minc10561 Minc10562 Minc10563 5
MIOP00960 Minc02326 Minc02327 Minc02328 Minc02329 Minc02330 5
MIOP01008 Minc11155 Minc11156 Minc11157 Minc11158 Minc11159 5
MIOP01019 Minc11279 Minc11280 Minc11281 Minc11282 Minc11283 5
MIOP01054 Minc11611 Minc11612 Minc11613 Minc11614 Minc11615 5
MIOP01116 Minc12084 Minc12085 Minc12086 Minc12087 Minc12088 5
MIOP01129 Minc12230 Minc12231 Minc12232 Minc12233 Minc12234 5
MIOP01174 Minc02961 Minc02962 Minc02963 Minc02964 Minc02965 5
MIOP01187 Minc12641 Minc12642 Minc12643 Minc12644 Minc12645 5
MIOP01191 Minc03000 Minc03001 Minc03002 Minc03003 Minc03004 5
MIOP01201 Minc12741 Minc12742 Minc12743 Minc12744 Minc12745 5
MIOP01228 Minc12880 Minc12881 Minc12882 Minc12883 Minc12884 5
MIOP01306 Minc00521 Minc00522 Minc00523 Minc00524 Minc00525 5
MIOP01314 Minc00579 Minc00580 Minc00581 Minc00582 Minc00583 5
MIOP01351 Minc13856 Minc13857 Minc13858 Minc13859 Minc13860 5
MIOP01388 Minc14078 Minc14079 Minc14080 Minc14081 Minc14082 5
MIOP01399 Minc14165 Minc14166 Minc14167 Minc14168 Minc14169 5
MIOP01431 Minc14394 Minc14395 Minc14396 Minc14397 Minc14398 5
MIOP01442 Minc14481 Minc14482 Minc14483 Minc14484 Minc14485 5
MIOP01520 Minc04059 Minc04060 Minc04061 Minc04062 Minc04063 5
MIOP01579 Minc15527 Minc15528 Minc15529 Minc15530 Minc15531 5
MIOP00015 Minc00738 Minc00739 Minc00740 Minc00741 4
MIOP00016 Minc00746 Minc00747 Minc00748 Minc00749 4
MIOP00024 Minc15620 Minc15621 Minc15622 Minc15623 4
MIOP00027 Minc15650 Minc15651 Minc15652 Minc15653 4
MIOP00046 Minc15779 Minc15780 Minc15781 Minc15782 4
MIOP00067 Minc15961 Minc15962 Minc15963 Minc15964 4
MIOP00096 Minc16134 Minc16135 Minc16136 Minc16137 4
MIOP00106 Minc16197 Minc16198 Minc16199 Minc16200 4
MIOP00136 Minc04854 Minc04855 Minc04856 Minc04857 4
MIOP00152 Minc16477 Minc16478 Minc16479 Minc16480 4
MIOP00164 Minc00863 Minc00865 Minc00866 Minc00867 4
MIOP00169 Minc04986 Minc04987 Minc04988 Minc04989 4
MIOP00180 Minc05017 Minc05018 Minc05019 Minc05020 4
MIOP00181 Minc05025 Minc05026 Minc05027 Minc05028 4
MIOP00214 Minc05169 Minc05170 Minc05171 Minc05172 4
MIOP00229 Minc00904 Minc00905 Minc00906 Minc00907 4
MIOP00241 Minc16989 Minc16990 Minc16991 Minc16992 4
MIOP00250 Minc05301 Minc05302 Minc05303 Minc05304 4
MIOP00287 Minc17301 Minc17302 Minc17303 Minc17305 4
MIOP00289 Minc05502 Minc05503 Minc05504 Minc05505 4
MIOP00295 Minc01007 Minc01008 Minc01009 Minc01010 4
MIOP00324 Minc05640 Minc05641 Minc05642 Minc05643 4
MIOP00351 Minc05791 Minc05792 Minc05793 Minc05794 4
MIOP00376 Minc17802 Minc17803 Minc17804 Minc17805 4
MIOP00381 Minc05980 Minc05981 Minc05982 Minc05983 4
MIOP00386 Minc17846 Minc17847 Minc17848 Minc17849 4
MIOP00410 Minc06122 Minc06123 Minc06124 Minc06125 4
MIOP00417 Minc06190 Minc06191 Minc06192 Minc06193 4
MIOP00428 Minc01155 Minc01156 Minc01157 Minc01158 4
MIOP00440 Minc06317 Minc06318 Minc06319 Minc06320 4
MIOP00465 Minc06527 Minc06528 Minc06529 Minc06530 4
MIOP00481 Minc06694 Minc06695 Minc06696 Minc06697 4
MIOP00484 Minc06713 Minc06714 Minc06715 Minc06716 4
MIOP00511 Minc06872 Minc06873 Minc06874 Minc06875 4
MIOP00529 Minc06982 Minc06983 Minc06984 Minc06985 4
MIOP00535 Minc07014 Minc07015 Minc07016 Minc07017 4
MIOP00570 Minc07376 Minc07377 Minc07378 Minc07379 4
MIOP00573 Minc07401 Minc07402 Minc07403 Minc07404 4
MIOP00585 Minc07553 Minc07554 Minc07555 Minc07556 4
MIOP00588 Minc07574 Minc07575 Minc07576 Minc07577 4
MIOP00610 Minc07707 Minc07708 Minc07709 Minc07710 4
MIOP00640 Minc01535 Minc01536 Minc01537 Minc01538 4
MIOP00652 Minc08039 Minc08040 Minc08041 Minc08042 4
MIOP00668 Minc08200 Minc08201 Minc08202 Minc08203 4
MIOP00682 Minc08348 Minc08349 Minc08350 Minc08351 4
MIOP00701 Minc08549 Minc08550 Minc08551 Minc08552 4
MIOP00724 Minc08798 Minc08799 Minc08800 Minc08801 4
MIOP00736 Minc01810 Minc01811 Minc01812 Minc01813 4
MIOP00749 Minc08959 Minc08960 Minc08961 Minc08962 4
MIOP00755 Minc09012 Minc09013 Minc09014 Minc09015 4
MIOP00766 Minc09117 Minc09118 Minc09119 Minc09120 4
MIOP00788 Minc09352 Minc09353 Minc09354 Minc09355 4
MIOP00808 Minc09519 Minc09520 Minc09521 Minc09522 4
MIOP00812 Minc09543 Minc09544 Minc09545 Minc09546 4
MIOP00818 Minc09600 Minc09601 Minc09602 Minc09603 4
MIOP00826 Minc09659 Minc09660 Minc09661 Minc09662 4
MIOP00843 Minc09806 Minc09807 Minc09808 Minc09809 4
MIOP00867 Minc10054 Minc10055 Minc10056 Minc10057 4
MIOP00882 Minc10206 Minc10207 Minc10208 Minc10209 4
MIOP00883 Minc10215 Minc10216 Minc10217 Minc10218 4
MIOP00886 Minc10243 Minc10244 Minc10245 Minc10246 4
MIOP00894 Minc10293 Minc10294 Minc10295 Minc10296 4
MIOP00907 Minc10351 Minc10352 Minc10353 Minc10354 4
MIOP00987 Minc02370 Minc02371 Minc02372 Minc02373 4
MIOP00988 Minc02380 Minc02381 Minc02382 Minc02384 4
MIOP00994 Minc10986 Minc10987 Minc10988 Minc10989 4
MIOP01002 Minc11092 Minc11093 Minc11094 Minc11095 4
MIOP01010 Minc11164 Minc11165 Minc11166 Minc11167 4
MIOP01011 Minc11172 Minc11173 Minc11174 Minc11175 4
MIOP01045 Minc11572 Minc11573 Minc11574 Minc11575 4
MIOP01077 Minc02663 Minc02664 Minc02665 Minc02666 4
MIOP01105 Minc02736 Minc02737 Minc02738 Minc02739 4
MIOP01152 Minc12397 Minc12398 Minc12399 Minc12400 4
MIOP01153 Minc12403 Minc12404 Minc12405 Minc12406 4
MIOP01160 Minc02885 Minc02886 Minc02887 Minc02888 4
MIOP01161 Minc02899 Minc02900 Minc02901 Minc02902 4
MIOP01175 Minc02971 Minc02972 Minc02973 Minc02974 4
MIOP01179 Minc12566 Minc12567 Minc12568 Minc12569 4
MIOP01188 Minc02984 Minc02985 Minc02986 Minc02987 4
MIOP01192 Minc03001 Minc03002 Minc03003 Minc03004 4
MIOP01196 Minc12691 Minc12692 Minc12693 Minc12694 4
MIOP01207 Minc03033 Minc03034 Minc03035 Minc03036 4
MIOP01218 Minc12791 Minc12792 Minc12793 Minc12794 4
MIOP01239 Minc12943 Minc12944 Minc12945 Minc12946 4
MIOP01249 Minc13059 Minc13060 Minc13061 Minc13062 4
MIOP01250 Minc13083 Minc13084 Minc13085 Minc13086 4
MIOP01251 Minc13098 Minc13099 Minc13100 Minc13101 4
MIOP01252 Minc13105 Minc13106 Minc13107 Minc13108 4
MIOP01253 Minc13110 Minc13111 Minc13112 Minc13113 4
MIOP01267 Minc13249 Minc13250 Minc13251 Minc13252 4
MIOP01293 Minc13457 Minc13458 Minc13459 Minc13460 4
MIOP01326 Minc03430 Minc03431 Minc03432 Minc03433 4
MIOP01354 Minc13891 Minc13892 Minc13893 Minc13894 4
MIOP01367 Minc13964 Minc13965 Minc13966 Minc13967 4
MIOP01407 Minc14219 Minc14220 Minc14221 Minc14222 4
MIOP01421 Minc00618 Minc00619 Minc00620 Minc00621 4
MIOP01445 Minc14512 Minc14513 Minc14514 Minc14515 4
MIOP01454 Minc03845 Minc03846 Minc03847 Minc03848 4
MIOP01463 Minc14654 Minc14655 Minc14656 Minc14657 4
MIOP01500 Minc03994 Minc03995 Minc03996 Minc03997 4
MIOP01517 Minc04034 Minc04035 Minc04036 Minc04037 4
MIOP01534 Minc04115 Minc04116 Minc04117 Minc04119 4
MIOP01552 Minc15258 Minc15259 Minc15260 Minc15261 4
MIOP01566 Minc15384 Minc15385 Minc15386 Minc15387 4
MIOP01569 Minc15412 Minc15413 Minc15414 Minc15415 4
MIOP01577 Minc15492 Minc15493 Minc15494 Minc15495 4
MIOP00002 Minc00016 Minc00017 Minc00018 3
MIOP00004 Minc00033 Minc00034 Minc00035 3
MIOP00009 Minc00127 Minc00128 Minc00129 3
MIOP00010 Minc00144 Minc00145 Minc00146 3
MIOP00013 Minc00727 Minc00728 Minc00729 3
MIOP00021 Minc04363 Minc04364 Minc04365 3
MIOP00023 Minc15611 Minc15612 Minc15613 3
MIOP00028 Minc04402 Minc04403 Minc04404 3
MIOP00034 Minc15694 Minc15695 Minc15696 3
MIOP00044 Minc04471 Minc04472 Minc04473 3
MIOP00053 Minc15840 Minc15841 Minc15842 3
MIOP00056 Minc15874 Minc15875 Minc15876 3
MIOP00057 Minc15886 Minc15887 Minc15888 3
MIOP00069 Minc15974 Minc15975 Minc15976 3
MIOP00073 Minc04594 Minc04595 Minc04596 3
MIOP00075 Minc16016 Minc16017 Minc16018 3
MIOP00077 Minc04615 Minc04616 Minc04617 3
MIOP00078 Minc16028 Minc16029 Minc16030 3
MIOP00093 Minc16124 Minc16125 Minc16126 3
MIOP00098 Minc16162 Minc16163 Minc16164 3
MIOP00103 Minc04723 Minc04724 Minc04725 3
MIOP00111 Minc04754 Minc04755 Minc04757 3
MIOP00114 Minc16239 Minc16240 Minc16241 3
MIOP00119 Minc16277 Minc16278 Minc16279 3
MIOP00120 Minc04793 Minc04794 Minc04795 3
MIOP00121 Minc04799 Minc04800 Minc04801 3
MIOP00129 Minc04840 Minc04841 Minc04842 3
MIOP00130 Minc16351 Minc16352 Minc16353 3
MIOP00142 Minc04886 Minc04887 Minc04888 3
MIOP00143 Minc04897 Minc04898 Minc04899 3
MIOP00145 Minc16454 Minc16455 Minc16456 3
MIOP00150 Minc04924 Minc04925 Minc04926 3
MIOP00157 Minc04952 Minc04953 Minc04954 3
MIOP00160 Minc16545 Minc16546 Minc16547 3
MIOP00167 Minc04966 Minc04967 Minc04968 3
MIOP00171 Minc04996 Minc04997 Minc04998 3
MIOP00175 Minc16597 Minc16598 Minc16599 3
MIOP00177 Minc05001 Minc05002 Minc05003 3
MIOP00182 Minc16608 Minc16609 Minc16610 3
MIOP00184 Minc16639 Minc16640 Minc16641 3
MIOP00190 Minc05064 Minc05065 Minc05066 3
MIOP00191 Minc05067 Minc05068 Minc05069 3
MIOP00194 Minc16711 Minc16712 Minc16713 3
MIOP00197 Minc05098 Minc05099 Minc05100 3
MIOP00200 Minc16758 Minc16759 Minc16760 3
MIOP00201 Minc16769 Minc16770 Minc16771 3
MIOP00204 Minc16813 Minc16814 Minc16815 3
MIOP00219 Minc05188 Minc05189 Minc05190 3
MIOP00221 Minc05206 Minc05207 Minc05208 3
MIOP00224 Minc05223 Minc05224 Minc05225 3
MIOP00233 Minc00920 Minc00921 Minc00922 3
MIOP00237 Minc00963 Minc00964 Minc00965 3
MIOP00238 Minc05251 Minc05252 Minc05253 3
MIOP00239 Minc05254 Minc05255 Minc05256 3
MIOP00243 Minc17000 Minc17001 Minc17002 3
MIOP00262 Minc17138 Minc17139 Minc17140 3
MIOP00266 Minc17160 Minc17161 Minc17162 3
MIOP00273 Minc05414 Minc05415 Minc05416 3
MIOP00274 Minc05419 Minc05420 Minc05421 3
MIOP00275 Minc17208 Minc17210 Minc17211 3
MIOP00279 Minc05426 Minc05427 Minc05428 3
MIOP00280 Minc05443 Minc05444 Minc05445 3
MIOP00282 Minc05454 Minc05455 Minc05456 3
MIOP00283 Minc05466 Minc05467 Minc05468 3
MIOP00296 Minc01011 Minc01012 Minc01013 3
MIOP00297 Minc01015 Minc01016 Minc01017 3
MIOP00298 Minc01019 Minc01020 Minc01021 3
MIOP00299 Minc05520 Minc05521 Minc05522 3
MIOP00308 Minc17411 Minc17412 Minc17413 3
MIOP00310 Minc05564 Minc05565 Minc05566 3
MIOP00316 Minc17475 Minc17476 Minc17477 3
MIOP00319 Minc05617 Minc05618 Minc05619 3
MIOP00322 Minc17505 Minc17506 Minc17507 3
MIOP00328 Minc17530 Minc17531 Minc17532 3
MIOP00335 Minc05707 Minc05708 Minc05709 3
MIOP00341 Minc17626 Minc17627 Minc17628 3
MIOP00352 Minc05809 Minc05810 Minc05811 3
MIOP00358 Minc05832 Minc05833 Minc05834 3
MIOP00365 Minc05881 Minc05882 Minc05883 3
MIOP00374 Minc05950 Minc05951 Minc05952 3
MIOP00379 Minc05968 Minc05969 Minc05970 3
MIOP00384 Minc05996 Minc05997 Minc05998 3
MIOP00392 Minc01100 Minc01101 Minc01102 3
MIOP00396 Minc06015 Minc06016 Minc06017 3
MIOP00401 Minc06047 Minc06048 Minc06049 3
MIOP00404 Minc06076 Minc06077 Minc06078 3
MIOP00411 Minc06129 Minc06130 Minc06131 3
MIOP00412 Minc06133 Minc06134 Minc06135 3
MIOP00413 Minc06139 Minc06140 Minc06141 3
MIOP00414 Minc06172 Minc06173 Minc06174 3
MIOP00416 Minc06184 Minc06185 Minc06186 3
MIOP00429 Minc01161 Minc01162 Minc01163 3

MIOP00443 Minc06338 Minc06339 Minc06340 3
MIOP00445 Minc06353 Minc06354 Minc06355 3
MIOP00454 Minc06449 Minc06450 Minc06451 3
MIOP00457 Minc01199 Minc01200 Minc01202 3
MIOP00462 Minc06501 Minc06502 Minc06503 3
MIOP00475 Minc06633 Minc06634 Minc06635 3
MIOP00480 Minc06691 Minc06692 Minc06693 3
MIOP00490 Minc01259 Minc01260 Minc01261 3
MIOP00496 Minc06751 Minc06752 Minc06753 3
MIOP00498 Minc06760 Minc06761 Minc06762 3
MIOP00501 Minc06798 Minc06799 Minc06800 3
MIOP00514 Minc06935 Minc06936 Minc06937 3
MIOP00520 Minc01302 Minc01303 Minc01304 3
MIOP00524 Minc06952 Minc06953 Minc06954 3
MIOP00525 Minc06960 Minc06961 Minc06962 3
MIOP00531 Minc06993 Minc06994 Minc06995 3
MIOP00536 Minc07042 Minc07043 Minc07044 3
MIOP00544 Minc07100 Minc07101 Minc07102 3
MIOP00554 Minc07219 Minc07220 Minc07221 3
MIOP00555 Minc07229 Minc07230 Minc07231 3
MIOP00557 Minc07254 Minc07255 Minc07256 3
MIOP00559 Minc07273 Minc07274 Minc07275 3
MIOP00562 Minc18656 Minc18657 Minc18658 3
MIOP00563 Minc07301 Minc07302 Minc07303 3
MIOP00566 Minc07340 Minc07341 Minc07342 3
MIOP00569 Minc01435 Minc01436 Minc01437 3
MIOP00578 Minc07450 Minc07451 Minc07452 3
MIOP00580 Minc07484 Minc07485 Minc07486 3
MIOP00583 Minc07510 Minc07511 Minc07512 3
MIOP00590 Minc01442 Minc01443 Minc01444 3
MIOP00592 Minc01459 Minc01460 Minc01461 3
MIOP00593 Minc07583 Minc07584 Minc07585 3
MIOP00594 Minc07586 Minc07587 Minc07588 3
MIOP00596 Minc07603 Minc07604 Minc07605 3

MIOP00597 Minc07624 Minc07625 Minc07626 3
MIOP00601 Minc07648 Minc07649 Minc07650 3
MIOP00611 Minc07731 Minc07732 Minc07733 3
MIOP00612 Minc07736 Minc07737 Minc07738 3
MIOP00613 Minc07740 Minc07741 Minc07742 3
MIOP00617 Minc07752 Minc07753 Minc07754 3
MIOP00620 Minc07776 Minc07778 Minc07779 3
MIOP00626 Minc07834 Minc07835 Minc07836 3
MIOP00627 Minc07842 Minc07843 Minc07844 3
MIOP00629 Minc07858 Minc07859 Minc07860 3
MIOP00645 Minc07990 Minc07991 Minc07992 3
MIOP00649 Minc08015 Minc08016 Minc08017 3
MIOP00650 Minc08025 Minc08026 Minc08027 3
MIOP00654 Minc08053 Minc08054 Minc08055 3
MIOP00661 Minc08146 Minc08147 Minc08148 3
MIOP00662 Minc08160 Minc08161 Minc08162 3
MIOP00669 Minc08205 Minc08206 Minc08207 3
MIOP00677 Minc08313 Minc08314 Minc08315 3
MIOP00681 Minc08338 Minc08339 Minc08340 3
MIOP00685 Minc01652 Minc01653 Minc01654 3
MIOP00689 Minc08365 Minc08366 Minc08367 3
MIOP00693 Minc08406 Minc08407 Minc08408 3
MIOP00703 Minc08565 Minc08566 Minc08567 3
MIOP00704 Minc08580 Minc08581 Minc08582 3
MIOP00706 Minc08591 Minc08592 Minc08593 3
MIOP00710 Minc08644 Minc08645 Minc08646 3
MIOP00711 Minc08648 Minc08649 Minc08650 3
MIOP00725 Minc08807 Minc08808 Minc08810 3
MIOP00726 Minc08816 Minc08817 Minc08818 3
MIOP00734 Minc01788 Minc01789 Minc01790 3
MIOP00742 Minc08915 Minc08916 Minc08917 3
MIOP00748 Minc08953 Minc08954 Minc08955 3
MIOP00761 Minc01843 Minc01844 Minc01845 3
MIOP00770 Minc09134 Minc09135 Minc09136 3

MIOP00774 Minc01909 Minc01910 Minc01911 3
MIOP00776 Minc01915 Minc01916 Minc01917 3
MIOP00787 Minc09337 Minc09338 Minc09339 3
MIOP00789 Minc09377 Minc09378 Minc09379 3
MIOP00790 Minc09391 Minc09392 Minc09393 3
MIOP00794 Minc01935 Minc01936 Minc01937 3
MIOP00797 Minc09396 Minc09397 Minc09398 3
MIOP00801 Minc09450 Minc09451 Minc09452 3
MIOP00803 Minc09468 Minc09469 Minc09470 3
MIOP00804 Minc09492 Minc09493 Minc09494 3
MIOP00816 Minc09579 Minc09580 Minc09581 3
MIOP00819 Minc09607 Minc09608 Minc09609 3
MIOP00820 Minc09622 Minc09623 Minc09624 3
MIOP00825 Minc09644 Minc09645 Minc09646 3
MIOP00829 Minc09688 Minc09689 Minc09690 3
MIOP00839 Minc09778 Minc09779 Minc09780 3
MIOP00844 Minc09811 Minc09812 Minc09813 3
MIOP00850 Minc09850 Minc09851 Minc09852 3
MIOP00853 Minc09870 Minc09871 Minc09872 3
MIOP00858 Minc09915 Minc09917 Minc09919 3
MIOP00875 Minc10149 Minc10150 Minc10151 3
MIOP00888 Minc10255 Minc10256 Minc10257 3
MIOP00889 Minc10270 Minc10271 Minc10272 3
MIOP00891 Minc10278 Minc10279 Minc10280 3
MIOP00897 Minc10327 Minc10328 Minc10329 3
MIOP00898 Minc10334 Minc10335 Minc10336 3
MIOP00909 Minc10386 Minc10387 Minc10388 3
MIOP00910 Minc10390 Minc10391 Minc10392 3
MIOP00916 Minc10453 Minc10454 Minc10455 3
MIOP00920 Minc00322 Minc00323 Minc00324 3
MIOP00932 Minc02231 Minc02232 Minc02233 3
MIOP00933 Minc02244 Minc02245 Minc02246 3
MIOP00934 Minc10495 Minc10496 Minc10497 3
MIOP00935 Minc10498 Minc10499 Minc10500 3

MIOP00937 Minc10514 Minc10515 Minc10516 3
MIOP00938 Minc10518 Minc10519 Minc10520 3
MIOP00944 Minc02259 Minc02260 Minc02261 3
MIOP00948 Minc10661 Minc10662 Minc10663 3
MIOP00951 Minc10693 Minc10694 Minc10695 3
MIOP00954 Minc10705 Minc10706 Minc10707 3
MIOP00956 Minc10728 Minc10729 Minc10730 3
MIOP00969 Minc10836 Minc10837 Minc10838 3
MIOP00974 Minc10861 Minc10862 Minc10863 3
MIOP00975 Minc10884 Minc10885 Minc10886 3
MIOP00996 Minc11017 Minc11018 Minc11019 3
MIOP01000 Minc11072 Minc11073 Minc11074 3
MIOP01022 Minc11294 Minc11295 Minc11296 3
MIOP01023 Minc11327 Minc11328 Minc11329 3
MIOP01024 Minc11334 Minc11335 Minc11336 3
MIOP01026 Minc02506 Minc02507 Minc02508 3
MIOP01030 Minc11420 Minc11421 Minc11422 3
MIOP01035 Minc11452 Minc11453 Minc11454 3
MIOP01036 Minc11463 Minc11464 Minc11465 3
MIOP01038 Minc11472 Minc11473 Minc11474 3
MIOP01039 Minc02534 Minc02535 Minc02536 3
MIOP01040 Minc11520 Minc11521 Minc11522 3
MIOP01044 Minc11567 Minc11568 Minc11569 3
MIOP01048 Minc02584 Minc02585 Minc02586 3
MIOP01051 Minc11587 Minc11588 Minc11589 3
MIOP01052 Minc11593 Minc11594 Minc11595 3
MIOP01056 Minc11644 Minc11645 Minc11646 3
MIOP01061 Minc00405 Minc00406 Minc00407 3
MIOP01064 Minc00425 Minc00426 Minc00427 3
MIOP01066 Minc11698 Minc11699 Minc11700 3
MIOP01069 Minc11760 Minc11761 Minc11762 3
MIOP01070 Minc11766 Minc11767 Minc11768 3
MIOP01075 Minc02647 Minc02648 Minc02649 3
MIOP01081 Minc11810 Minc11811 Minc11812 3

MIOP01092 Minc02689 Minc02690 Minc02691 3
MIOP01094 Minc11936 Minc11937 Minc11938 3
MIOP01096 Minc11944 Minc11945 Minc11946 3
MIOP01099 Minc11989 Minc11990 Minc11991 3
MIOP01101 Minc12022 Minc12023 Minc12024 3
MIOP01107 Minc02750 Minc02751 Minc02752 3
MIOP01117 Minc12095 Minc12096 Minc12097 3
MIOP01124 Minc12193 Minc12194 Minc12195 3
MIOP01128 Minc12213 Minc12214 Minc12215 3
MIOP01132 Minc12257 Minc12258 Minc12259 3
MIOP01137 Minc12299 Minc12300 Minc12301 3
MIOP01139 Minc12330 Minc12331 Minc12332 3
MIOP01146 Minc02875 Minc02876 Minc02877 3
MIOP01189 Minc02990 Minc02991 Minc02992 3
MIOP01190 Minc02994 Minc02995 Minc02996 3
MIOP01199 Minc12726 Minc12727 Minc12728 3
MIOP01213 Minc03072 Minc03073 Minc03074 3
MIOP01216 Minc12773 Minc12774 Minc12775 3
MIOP01217 Minc12782 Minc12783 Minc12784 3
MIOP01231 Minc12904 Minc12905 Minc12906 3
MIOP01233 Minc12934 Minc12935 Minc12936 3
MIOP01245 Minc03150 Minc03151 Minc03152 3
MIOP01256 Minc03205 Minc03206 Minc03207 3
MIOP01263 Minc13201 Minc13202 Minc13203 3
MIOP01273 Minc03260 Minc03261 Minc03262 3
MIOP01275 Minc13289 Minc13290 Minc13291 3
MIOP01281 Minc03302 Minc03303 Minc03304 3
MIOP01284 Minc13379 Minc13380 Minc13381 3
MIOP01288 Minc03315 Minc03316 Minc03317 3
MIOP01291 Minc03329 Minc03330 Minc03331 3
MIOP01294 Minc03342 Minc03343 Minc03344 3
MIOP01295 Minc03359 Minc03360 Minc03361 3
MIOP01297 Minc03370 Minc03371 Minc03372 3
MIOP01300 Minc13524 Minc13525 Minc13526 3

MIOP01301 Minc13533 Minc13534 Minc13535 3
MIOP01302 Minc13555 Minc13556 Minc13557 3
MIOP01318 Minc03404 Minc03405 Minc03406 3
MIOP01320 Minc13602 Minc13603 Minc13604 3
MIOP01325 Minc13678 Minc13679 Minc13680 3
MIOP01330 Minc13714 Minc13715 Minc13716 3
MIOP01333 Minc13743 Minc13744 Minc13745 3
MIOP01335 Minc13754 Minc13755 Minc13756 3
MIOP01337 Minc13760 Minc13761 Minc13762 3
MIOP01338 Minc13781 Minc13782 Minc13783 3
MIOP01366 Minc13959 Minc13960 Minc13961 3
MIOP01375 Minc03611 Minc03612 Minc03613 3
MIOP01376 Minc14001 Minc14002 Minc14003 3
MIOP01378 Minc14030 Minc14031 Minc14032 3
MIOP01379 Minc14041 Minc14042 Minc14043 3
MIOP01386 Minc03645 Minc03646 Minc03647 3
MIOP01391 Minc14107 Minc14108 Minc14109 3
MIOP01392 Minc14121 Minc14122 Minc14123 3
MIOP01400 Minc14181 Minc14182 Minc14183 3
MIOP01401 Minc14193 Minc14194 Minc14195 3
MIOP01403 Minc03681 Minc03682 Minc03683 3
MIOP01404 Minc03684 Minc03685 Minc03686 3
MIOP01408 Minc14235 Minc14236 Minc14237 3
MIOP01418 Minc14314 Minc14315 Minc14316 3
MIOP01429 Minc14379 Minc14380 Minc14381 3
MIOP01437 Minc14437 Minc14438 Minc14439 3
MIOP01441 Minc14475 Minc14476 Minc14477 3
MIOP01446 Minc14516 Minc14517 Minc14518 3
MIOP01447 Minc14525 Minc14526 Minc14527 3
MIOP01450 Minc03813 Minc03814 Minc03815 3
MIOP01455 Minc03855 Minc03856 Minc03857 3
MIOP01458 Minc03859 Minc03860 Minc03861 3
MIOP01467 Minc14687 Minc14688 Minc14689 3
MIOP01478 Minc14759 Minc14760 Minc14761 3

MIOP01480 Minc14775 Minc14776 Minc14777 3
MIOP01482 Minc14783 Minc14784 Minc14785 3
MIOP01484 Minc14812 Minc14813 Minc14814 3
MIOP01498 Minc03978 Minc03979 Minc03980 3
MIOP01499 Minc03985 Minc03986 Minc03987 3
MIOP01508 Minc04008 Minc04009 Minc04010 3
MIOP01512 Minc14995 Minc14996 Minc14997 3
MIOP01513 Minc14999 Minc15000 Minc15001 3
MIOP01519 Minc04051 Minc04052 Minc04053 3
MIOP01527 Minc04093 Minc04094 Minc04095 3
MIOP01529 Minc15118 Minc15119 Minc15120 3
MIOP01532 Minc04109 Minc04110 Minc04111 3
MIOP01550 Minc04202 Minc04203 Minc04204 3
MIOP01551 Minc04205 Minc04206 Minc04207 3
MIOP01555 Minc15289 Minc15290 Minc15291 3
MIOP01558 Minc15325 Minc15326 Minc15327 3
MIOP01561 Minc15340 Minc15341 Minc15342 3
MIOP01578 Minc15500 Minc15501 Minc15502 3
MIOP00001 Minc00020 Minc00021 2
MIOP00003 Minc00030 Minc00031 2
MIOP00005 Minc00043 Minc00044 2
MIOP00006 Minc00064 Minc00065 2
MIOP00007 Minc00071 Minc00072 2
MIOP00008 Minc00105 Minc00106 2
MIOP00011 Minc00160 Minc00161 2
MIOP00012 Minc00724 Minc00725 2
MIOP00014 Minc00731 Minc00732 2
MIOP00017 Minc00751 Minc00752 2
MIOP00018 Minc00755 Minc00756 2
MIOP00019 Minc00759 Minc00760 2
MIOP00020 Minc00771 Minc00772 2
MIOP00022 Minc15605 Minc15606 2
MIOP00025 Minc15627 Minc15628 2
MIOP00026 Minc15647 Minc15648 2

MIOP00030 Minc04419 Minc04420 2
MIOP00031 Minc04423 Minc04424 2
MIOP00032 Minc15679 Minc15680 2
MIOP00036 Minc04445 Minc04446 2
MIOP00037 Minc04447 Minc04448 2
MIOP00038 Minc04457 Minc04458 2
MIOP00039 Minc15705 Minc15706 2
MIOP00040 Minc15739 Minc15740 2
MIOP00041 Minc15741 Minc15742 2
MIOP00042 Minc15751 Minc15752 2
MIOP00043 Minc04461 Minc04462 2
MIOP00045 Minc15775 Minc15776 2
MIOP00047 Minc15789 Minc15790 2
MIOP00048 Minc15794 Minc15795 2
MIOP00049 Minc15802 Minc15804 2
MIOP00050 Minc15813 Minc15814 2
MIOP00051 Minc15815 Minc15816 2
MIOP00052 Minc04509 Minc04510 2
MIOP00055 Minc04514 Minc04515 2
MIOP00058 Minc15890 Minc15891 2
MIOP00059 Minc15901 Minc15902 2
MIOP00060 Minc15906 Minc15907 2
MIOP00061 Minc15909 Minc15910 2
MIOP00062 Minc15915 Minc15916 2
MIOP00063 Minc15928 Minc15929 2
MIOP00064 Minc15930 Minc15931 2
MIOP00065 Minc04549 Minc04550 2
MIOP00066 Minc04553 Minc04554 2
MIOP00068 Minc15967 Minc15968 2
MIOP00070 Minc04579 Minc04580 2
MIOP00071 Minc04586 Minc04587 2
MIOP00072 Minc04591 Minc04592 2
MIOP00079 Minc16050 Minc16051 2
MIOP00080 Minc16058 Minc16059 2

MIOP00081 Minc16062 Minc16063 2
MIOP00082 Minc16066 Minc16067 2
MIOP00083 Minc04657 Minc04658 2
MIOP00084 Minc16070 Minc16071 2
MIOP00085 Minc16104 Minc16105 2
MIOP00086 Minc16108 Minc16109 2
MIOP00087 Minc16116 Minc16117 2
MIOP00088 Minc00807 Minc00808 2
MIOP00089 Minc00812 Minc00813 2
MIOP00090 Minc00819 Minc00820 2
MIOP00091 Minc00829 Minc00830 2
MIOP00092 Minc16121 Minc16122 2
MIOP00094 Minc16127 Minc16128 2
MIOP00095 Minc16129 Minc16130 2
MIOP00097 Minc16154 Minc16153 2
MIOP00099 Minc04694 Minc04695 2
MIOP00100 Minc04698 Minc04699 2
MIOP00101 Minc04703 Minc04704 2
MIOP00102 Minc04719 Minc04720 2
MIOP00104 Minc16175 Minc16176 2
MIOP00105 Minc16187 Minc16188 2
MIOP00107 Minc16205 Minc16206 2
MIOP00108 Minc04763 Minc04764 2
MIOP00109 Minc04735 Minc04736 2
MIOP00110 Minc04749 Minc04750 2
MIOP00112 Minc16223 Minc16224 2
MIOP00113 Minc16228 Minc16229 2
MIOP00115 Minc16250 Minc16251 2
MIOP00116 Minc16256 Minc16257 2
MIOP00117 Minc04770 Minc04771 2
MIOP00118 Minc16264 Minc16265 2
MIOP00122 Minc04817 Minc04818 2
MIOP00123 Minc04821 Minc04822 2
MIOP00124 Minc16290 Minc16291 2

MIOP00125 Minc16312 Minc16313 2
MIOP00126 Minc04827 Minc04828 2
MIOP00127 Minc04831 Minc04832 2
MIOP00128 Minc04837 Minc04838 2
MIOP00131 Minc16360 Minc16361 2
MIOP00132 Minc16366 Minc16367 2
MIOP00133 Minc16380 Minc16381 2
MIOP00134 Minc04846 Minc04847 2
MIOP00135 Minc04850 Minc04851 2
MIOP00137 Minc04868 Minc04869 2
MIOP00138 Minc16398 Minc16399 2
MIOP00139 Minc16402 Minc16403 2
MIOP00140 Minc16405 Minc16406 2
MIOP00141 Minc16417 Minc16418 2
MIOP00144 Minc16446 Minc16447 2
MIOP00146 Minc16459 Minc16460 2
MIOP00147 Minc16464 Minc16465 2
MIOP00148 Minc04913 Minc04914 2
MIOP00151 Minc16475 Minc16476 2
MIOP00153 Minc16487 Minc16488 2
MIOP00154 Minc16508 Minc16510 2
MIOP00155 Minc04943 Minc04945 2
MIOP00156 Minc04950 Minc04951 2
MIOP00158 Minc16534 Minc16535 2
MIOP00159 Minc16538 Minc16539 2
MIOP00161 Minc16549 Minc16550 2
MIOP00162 Minc16554 Minc16555 2
MIOP00163 Minc00832 Minc00833 2
MIOP00165 Minc00877 Minc00878 2
MIOP00166 Minc00880 Minc00881 2
MIOP00170 Minc04992 Minc04993 2
MIOP00172 Minc16575 Minc16577 2
MIOP00173 Minc16580 Minc16581 2
MIOP00174 Minc16584 Minc16586 2

MIOP00176 Minc16604 Minc16605 2
MIOP00178 Minc05005 Minc05006 2
MIOP00179 Minc05009 Minc05010 2
MIOP00183 Minc16635 Minc16636 2
MIOP00185 Minc05043 Minc05044 2
MIOP00186 Minc16662 Minc16663 2
MIOP00187 Minc16667 Minc16668 2
MIOP00188 Minc16677 Minc16678 2
MIOP00189 Minc16688 Minc16689 2
MIOP00192 Minc16700 Minc16701 2
MIOP00193 Minc16706 Minc16707 2
MIOP00195 Minc05080 Minc05081 2
MIOP00196 Minc05094 Minc05095 2
MIOP00198 Minc16740 Minc16741 2
MIOP00199 Minc16742 Minc16743 2
MIOP00202 Minc05115 Minc05116 2
MIOP00203 Minc16804 Minc16805 2
MIOP00205 Minc16825 Minc16826 2
MIOP00206 Minc16828 Minc16829 2
MIOP00207 Minc05129 Minc05130 2
MIOP00208 Minc05133 Minc05134 2
MIOP00209 Minc05142 Minc05143 2
MIOP00210 Minc05153 Minc05154 2
MIOP00211 Minc16839 Minc16840 2
MIOP00212 Minc16848 Minc16849 2
MIOP00213 Minc05167 Minc05168 2
MIOP00215 Minc16873 Minc16874 2
MIOP00216 Minc16878 Minc16879 2
MIOP00217 Minc16884 Minc16883 2
MIOP00218 Minc16898 Minc16899 2
MIOP00220 Minc05200 Minc05201 2
MIOP00222 Minc05215 Minc05216 2
MIOP00223 Minc16917 Minc16918 2
MIOP00225 Minc16949 Minc16950 2

MIOP00226 Minc16960 Minc16961 2
MIOP00227 Minc16976 Minc16977 2
MIOP00228 Minc16981 Minc16982 2
MIOP00230 Minc00909 Minc00910 2
MIOP00231 Minc00912 Minc00913 2
MIOP00232 Minc00915 Minc00916 2
MIOP00234 Minc00924 Minc00925 2
MIOP00235 Minc00928 Minc00929 2
MIOP00236 Minc00946 Minc00947 2
MIOP00242 Minc16997 Minc16998 2
MIOP00244 Minc17006 Minc17007 2
MIOP00245 Minc17018 Minc17019 2
MIOP00246 Minc17022 Minc17023 2
MIOP00247 Minc17025 Minc17027 2
MIOP00248 Minc17041 Minc17042 2
MIOP00249 Minc17055 Minc17056 2
MIOP00251 Minc17061 Minc17062 2
MIOP00252 Minc05319 Minc05320 2
MIOP00253 Minc05321 Minc05322 2
MIOP00254 Minc05328 Minc05329 2
MIOP00255 Minc17094 Minc17095 2
MIOP00256 Minc17112 Minc17113 2
MIOP00257 Minc17114 Minc17115 2
MIOP00258 Minc17118 Minc17119 2
MIOP00259 Minc05349 Minc05350 2
MIOP00260 Minc05356 Minc05357 2
MIOP00261 Minc17128 Minc17129 2
MIOP00263 Minc17145 Minc17146 2
MIOP00264 Minc17152 Minc17153 2
MIOP00265 Minc17156 Minc17157 2
MIOP00267 Minc17163 Minc17164 2
MIOP00268 Minc05385 Minc05386 2
MIOP00269 Minc05387 Minc05388 2
MIOP00270 Minc17177 Minc17178 2

MIOP00271 Minc17193 Minc17194 2
MIOP00272 Minc05411 Minc05412 2
MIOP00276 Minc17214 Minc17215 2
MIOP00277 Minc17230 Minc17231 2
MIOP00278 Minc17233 Minc17234 2
MIOP00281 Minc05449 Minc05450 2
MIOP00284 Minc05473 Minc05474 2
MIOP00285 Minc05476 Minc05477 2
MIOP00286 Minc17296 Minc17297 2
MIOP00288 Minc05483 Minc05485 2
MIOP00290 Minc17316 Minc17317 2
MIOP00291 Minc17329 Minc17330 2
MIOP00292 Minc17332 Minc17333 2
MIOP00293 Minc17336 Minc17337 2
MIOP00294 Minc01003 Minc01004 2
MIOP00300 Minc17345 Minc17346 2
MIOP00301 Minc17359 Minc17360 2
MIOP00302 Minc17363 Minc17364 2
MIOP00303 Minc17373 Minc17374 2
MIOP00304 Minc05527 Minc05528 2
MIOP00305 Minc17388 Minc17389 2
MIOP00306 Minc17399 Minc17400 2
MIOP00307 Minc17408 Minc17409 2
MIOP00309 Minc05556 Minc05557 2
MIOP00311 Minc05567 Minc05568 2
MIOP00312 Minc05582 Minc05583 2
MIOP00313 Minc05589 Minc05590 2
MIOP00314 Minc17453 Minc17454 2
MIOP00315 Minc17466 Minc17467 2
MIOP00317 Minc05602 Minc05603 2
MIOP00318 Minc05609 Minc05610 2
MIOP00320 Minc05621 Minc05622 2
MIOP00321 Minc05624 Minc05625 2
MIOP00323 Minc05631 Minc05632 2

MIOP00325 Minc17516 Minc17517 2
MIOP00326 Minc17518 Minc17519 2
MIOP00327 Minc17524 Minc17525 2
MIOP00329 Minc05660 Minc05661 2
MIOP00330 Minc17551 Minc17552 2
MIOP00331 Minc05666 Minc05667 2
MIOP00332 Minc05679 Minc05680 2
MIOP00333 Minc05686 Minc05687 2
MIOP00334 Minc05712 Minc05713 2
MIOP00336 Minc17584 Minc17585 2
MIOP00337 Minc17586 Minc17587 2
MIOP00338 Minc05725 Minc05726 2
MIOP00339 Minc05729 Minc05730 2
MIOP00340 Minc05740 Minc05741 2
MIOP00342 Minc01027 Minc01028 2
MIOP00343 Minc01050 Minc01051 2
MIOP00344 Minc01062 Minc01063 2
MIOP00345 Minc05763 Minc05764 2
MIOP00346 Minc05766 Minc05767 2
MIOP00347 Minc05777 Minc05778 2
MIOP00348 Minc17644 Minc17645 2
MIOP00349 Minc05781 Minc05782 2
MIOP00350 Minc05788 Minc05789 2
MIOP00353 Minc05816 Minc05817 2
MIOP00354 Minc05818 Minc05819 2
MIOP00355 Minc17684 Minc17685 2
MIOP00356 Minc17691 Minc17692 2
MIOP00357 Minc17698 Minc17699 2
MIOP00359 Minc17711 Minc17712 2
MIOP00360 Minc05851 Minc05852 2
MIOP00361 Minc05858 Minc05859 2
MIOP00362 Minc17743 Minc17744 2
MIOP00363 Minc17748 Minc17749 2
MIOP00364 Minc05871 Minc05872 2

MIOP00367 Minc05898 Minc05899 2
MIOP00368 Minc05913 Minc05914 2
MIOP00369 Minc05916 Minc05917 2
MIOP00370 Minc05928 Minc05929 2
MIOP00371 Minc17768 Minc17769 2
MIOP00372 Minc17772 Minc17773 2
MIOP00373 Minc17775 Minc17776 2
MIOP00375 Minc17796 Minc17797 2
MIOP00377 Minc05962 Minc05963 2
MIOP00378 Minc05964 Minc05965 2
MIOP00380 Minc05977 Minc05978 2
MIOP00382 Minc05984 Minc05985 2
MIOP00383 Minc05992 Minc05993 2
MIOP00385 Minc17838 Minc17839 2
MIOP00388 Minc17859 Minc17860 2
MIOP00389 Minc01072 Minc01073 2
MIOP00390 Minc01086 Minc01087 2
MIOP00391 Minc01089 Minc01090 2
MIOP00393 Minc01107 Minc01108 2
MIOP00395 Minc06025 Minc06026 2
MIOP00397 Minc17869 Minc17870 2
MIOP00398 Minc17872 Minc17873 2
MIOP00399 Minc17883 Minc17884 2
MIOP00400 Minc06034 Minc06035 2
MIOP00402 Minc17898 Minc17899 2
MIOP00403 Minc06073 Minc06075 2
MIOP00405 Minc17939 Minc17940 2
MIOP00406 Minc06084 Minc06085 2
MIOP00407 Minc06093 Minc06094 2
MIOP00408 Minc06107 Minc06108 2
MIOP00409 Minc06110 Minc06111 2
MIOP00415 Minc06176 Minc06177 2
MIOP00418 Minc06199 Minc06200 2
MIOP00419 Minc18011 Minc18012 2

MIOP00420 Minc18015 Minc18016 2
MIOP00422 Minc06242 Minc06243 2
MIOP00423 Minc06249 Minc06250 2
MIOP00424 Minc18063 Minc18064 2
MIOP00425 Minc01143 Minc01144 2
MIOP00426 Minc01148 Minc01149 2
MIOP00427 Minc01151 Minc01152 2
MIOP00430 Minc01169 Minc01170 2
MIOP00431 Minc06253 Minc06254 2
MIOP00432 Minc06255 Minc06256 2
MIOP00433 Minc06270 Minc06271 2
MIOP00434 Minc06274 Minc06275 2
MIOP00435 Minc06284 Minc06285 2
MIOP00436 Minc18085 Minc18086 2
MIOP00437 Minc18096 Minc18097 2
MIOP00438 Minc06295 Minc06296 2
MIOP00439 Minc06310 Minc06311 2
MIOP00441 Minc06326 Minc06327 2
MIOP00442 Minc06329 Minc06330 2
MIOP00444 Minc06350 Minc06351 2
MIOP00446 Minc18145 Minc18146 2
MIOP00447 Minc18160 Minc18161 2
MIOP00448 Minc06391 Minc06392 2
MIOP00449 Minc18177 Minc18178 2
MIOP00450 Minc06411 Minc06412 2
MIOP00451 Minc06423 Minc06424 2
MIOP00452 Minc06432 Minc06433 2
MIOP00453 Minc06443 Minc06444 2
MIOP00455 Minc18210 Minc18209 2
MIOP00456 Minc06473 Minc06474 2
MIOP00459 Minc01219 Minc01220 2
MIOP00460 Minc01244 Minc01245 2
MIOP00461 Minc06499 Minc06500 2
MIOP00463 Minc06509 Minc06510 2

MIOP00464 Minc06519 Minc06520 2
MIOP00466 Minc06531 Minc06532 2
MIOP00467 Minc06556 Minc06557 2
MIOP00468 Minc06565 Minc06566 2
MIOP00469 Minc06582 Minc06583 2
MIOP00470 Minc18281 Minc18282 2
MIOP00471 Minc06593 Minc06594 2
MIOP00472 Minc18297 Minc18298 2
MIOP00473 Minc06617 Minc06618 2
MIOP00474 Minc06631 Minc06632 2
MIOP00476 Minc06654 Minc06655 2
MIOP00477 Minc18323 Minc18322 2
MIOP00478 Minc06671 Minc06672 2
MIOP00479 Minc06680 Minc06681 2
MIOP00482 Minc18346 Minc18347 2
MIOP00483 Minc06706 Minc06707 2
MIOP00485 Minc06718 Minc06719 2
MIOP00486 Minc06722 Minc06723 2
MIOP00487 Minc01271 Minc01272 2
MIOP00488 Minc01249 Minc01250 2
MIOP00489 Minc01253 Minc01254 2
MIOP00491 Minc01274 Minc01275 2
MIOP00492 Minc01280 Minc01281 2
MIOP00493 Minc01285 Minc01286 2
MIOP00494 Minc01290 Minc01291 2
MIOP00495 Minc06732 Minc06731 2
MIOP00497 Minc06756 Minc06757 2
MIOP00499 Minc06773 Minc06774 2
MIOP00500 Minc18382 Minc18383 2
MIOP00502 Minc18394 Minc18395 2
MIOP00503 Minc18397 Minc18398 2
MIOP00504 Minc06807 Minc06808 2
MIOP00505 Minc06818 Minc06819 2
MIOP00506 Minc06826 Minc06827 2

MIOP00507 Minc06834 Minc06836 2
MIOP00508 Minc06840 Minc06841 2
MIOP00509 Minc18418 Minc18419 2
MIOP00510 Minc06856 Minc06857 2
MIOP00512 Minc06879 Minc06880 2
MIOP00513 Minc06890 Minc06891 2
MIOP00515 Minc06939 Minc06940 2
MIOP00516 Minc00176 Minc00178 2
MIOP00517 Minc00181 Minc00182 2
MIOP00518 Minc00198 Minc00199 2
MIOP00519 Minc00219 Minc00220 2
MIOP00521 Minc01334 Minc01335 2
MIOP00522 Minc01337 Minc01338 2
MIOP00523 Minc01340 Minc01341 2
MIOP00526 Minc06964 Minc06965 2
MIOP00527 Minc18484 Minc18485 2
MIOP00528 Minc06976 Minc06977 2
MIOP00530 Minc06989 Minc06990 2
MIOP00532 Minc18509 Minc18510 2
MIOP00533 Minc07005 Minc07006 2
MIOP00534 Minc07009 Minc07010 2
MIOP00537 Minc07058 Minc07059 2
MIOP00538 Minc07061 Minc07062 2
MIOP00539 Minc07064 Minc07065 2
MIOP00540 Minc18532 Minc18533 2
MIOP00541 Minc07087 Minc07088 2
MIOP00542 Minc07094 Minc07095 2
MIOP00543 Minc07097 Minc07098 2
MIOP00546 Minc07142 Minc07143 2
MIOP00547 Minc18573 Minc18574 2
MIOP00548 Minc18575 Minc18576 2
MIOP00549 Minc07151 Minc07152 2
MIOP00550 Minc01354 Minc01355 2
MIOP00553 Minc07190 Minc07191 2

MIOP00556 Minc07252 Minc07253 2
MIOP00558 Minc07259 Minc07260 2
MIOP00560 Minc18649 Minc18648 2
MIOP00561 Minc18651 Minc18652 2
MIOP00564 Minc18664 Minc18665 2
MIOP00565 Minc18672 Minc18673 2
MIOP00567 Minc07344 Minc07345 2
MIOP00568 Minc18696 Minc18697 2
MIOP00571 Minc07382 Minc07383 2
MIOP00572 Minc07396 Minc07397 2
MIOP00574 Minc07417 Minc07418 2
MIOP00575 Minc07413 Minc07414 2
MIOP00576 Minc07430 Minc07431 2
MIOP00577 Minc07440 Minc07441 2
MIOP00579 Minc07458 Minc07459 2
MIOP00581 Minc07495 Minc07496 2
MIOP00584 Minc18757 Minc18758 2
MIOP00586 Minc18762 Minc18763 2
MIOP00587 Minc07565 Minc07566 2
MIOP00589 Minc01475 Minc01476 2
MIOP00591 Minc01456 Minc01457 2
MIOP00595 Minc07596 Minc07597 2
MIOP00598 Minc07630 Minc07631 2
MIOP00599 Minc07635 Minc07636 2
MIOP00600 Minc07646 Minc07647 2
MIOP00602 Minc07659 Minc07660 2
MIOP00603 Minc07665 Minc07666 2
MIOP00604 Minc07669 Minc07670 2
MIOP00605 Minc07675 Minc07676 2
MIOP00606 Minc07677 Minc07678 2
MIOP00607 Minc07680 Minc07681 2
MIOP00608 Minc07695 Minc07696 2
MIOP00609 Minc07703 Minc07705 2
MIOP00614 Minc07743 Minc07744 2

MIOP00615 Minc01490 Minc01491 2
MIOP00616 Minc01496 Minc01497 2
MIOP00619 Minc18876 Minc18877 2
MIOP00622 Minc07802 Minc07803 2
MIOP00623 Minc07814 Minc07815 2
MIOP00624 Minc07819 Minc07820 2
MIOP00625 Minc07822 Minc07823 2
MIOP00628 Minc07850 Minc07851 2
MIOP00630 Minc07862 Minc07863 2
MIOP00631 Minc07877 Minc07878 2
MIOP00632 Minc07879 Minc07880 2
MIOP00633 Minc07892 Minc07893 2
MIOP00634 Minc07899 Minc07900 2
MIOP00635 Minc07904 Minc07905 2
MIOP00636 Minc07921 Minc07922 2
MIOP00637 Minc07924 Minc07925 2
MIOP00638 Minc18936 Minc18937 2
MIOP00639 Minc18944 Minc18945 2
MIOP00641 Minc01545 Minc01546 2
MIOP00642 Minc01561 Minc01562 2
MIOP00643 Minc07980 Minc07981 2
MIOP00644 Minc07986 Minc07987 2
MIOP00646 Minc08003 Minc08004 2
MIOP00647 Minc08006 Minc08007 2
MIOP00648 Minc08008 Minc08009 2
MIOP00651 Minc08037 Minc08038 2
MIOP00653 Minc08043 Minc08045 2
MIOP00655 Minc08071 Minc08072 2
MIOP00656 Minc08079 Minc08080 2
MIOP00657 Minc08122 Minc08123 2
MIOP00658 Minc08129 Minc08130 2
MIOP00659 Minc08135 Minc08136 2
MIOP00660 Minc08138 Minc08139 2
MIOP00663 Minc08165 Minc08166 2

MIOP00664 Minc08172 Minc08173 2
MIOP00665 Minc01591 Minc01592 2
MIOP00666 Minc01620 Minc01621 2
MIOP00667 Minc08183 Minc08185 2
MIOP00670 Minc08223 Minc08224 2
MIOP00671 Minc08230 Minc08232 2
MIOP00672 Minc08250 Minc08251 2
MIOP00673 Minc08252 Minc08253 2
MIOP00674 Minc08259 Minc08260 2
MIOP00675 Minc08268 Minc08269 2
MIOP00676 Minc08279 Minc08280 2
MIOP00678 Minc08318 Minc08319 2
MIOP00679 Minc08328 Minc08329 2
MIOP00680 Minc08336 Minc08337 2
MIOP00683 Minc08358 Minc08359 2
MIOP00684 Minc01638 Minc01639 2
MIOP00686 Minc01656 Minc01657 2
MIOP00687 Minc01665 Minc01666 2
MIOP00690 Minc08371 Minc08372 2
MIOP00691 Minc08383 Minc08384 2
MIOP00692 Minc08389 Minc08390 2
MIOP00695 Minc08447 Minc08448 2
MIOP00696 Minc08478 Minc08479 2
MIOP00697 Minc08482 Minc08483 2
MIOP00698 Minc08489 Minc08490 2
MIOP00699 Minc08508 Minc08509 2
MIOP00700 Minc19138 Minc19139 2
MIOP00702 Minc08560 Minc08561 2
MIOP00705 Minc08584 Minc08585 2
MIOP00707 Minc08594 Minc08595 2
MIOP00709 Minc19174 Minc19173 2
MIOP00712 Minc19179 Minc19180 2
MIOP00713 Minc08658 Minc08659 2
MIOP00714 Minc08663 Minc08664 2

MIOP00715 Minc08686 Minc08687 2
MIOP00716 Minc19187 Minc19188 2
MIOP00717 Minc01725 Minc01726 2
MIOP00718 Minc01730 Minc01731 2
MIOP00719 Minc01743 Minc01744 2
MIOP00720 Minc08723 Minc08724 2
MIOP00721 Minc08730 Minc08731 2
MIOP00722 Minc08768 Minc08769 2
MIOP00723 Minc08772 Minc08773 2
MIOP00727 Minc08848 Minc08849 2
MIOP00728 Minc08858 Minc08859 2
MIOP00729 Minc00277 Minc00278 2
MIOP00730 Minc00244 Minc00245 2
MIOP00731 Minc00281 Minc00282 2
MIOP00732 Minc00283 Minc00284 2
MIOP00733 Minc00294 Minc00295 2
MIOP00735 Minc01808 Minc01809 2
MIOP00737 Minc01815 Minc01816 2
MIOP00738 Minc08869 Minc08870 2
MIOP00739 Minc08876 Minc08877 2
MIOP00740 Minc08898 Minc08899 2
MIOP00741 Minc08909 Minc08910 2
MIOP00743 Minc08929 Minc08930 2
MIOP00744 Minc08931 Minc08932 2
MIOP00746 Minc08941 Minc08942 2
MIOP00747 Minc08948 Minc08949 2
MIOP00750 Minc08966 Minc08968 2
MIOP00751 Minc08970 Minc08971 2
MIOP00752 Minc08980 Minc08981 2
MIOP00753 Minc08992 Minc08993 2
MIOP00754 Minc08997 Minc08998 2
MIOP00756 Minc09017 Minc09018 2
MIOP00757 Minc09020 Minc09021 2
MIOP00758 Minc01848 Minc01849 2

MIOP00759 Minc01828 Minc01829 2
MIOP00760 Minc01832 Minc01833 2
MIOP00762 Minc01853 Minc01854 2
MIOP00764 Minc09072 Minc09073 2
MIOP00765 Minc09112 Minc09114 2
MIOP00767 Minc09122 Minc09123 2
MIOP00768 Minc09125 Minc09126 2
MIOP00769 Minc09127 Minc09128 2
MIOP00771 Minc09189 Minc09190 2
MIOP00772 Minc09199 Minc09200 2
MIOP00773 Minc09205 Minc09207 2
MIOP00775 Minc01912 Minc01913 2
MIOP00777 Minc09216 Minc09217 2
MIOP00778 Minc09226 Minc09227 2
MIOP00779 Minc09240 Minc09241 2
MIOP00780 Minc09266 Minc09267 2
MIOP00783 Minc09293 Minc09294 2
MIOP00786 Minc09331 Minc09332 2
MIOP00791 Minc01919 Minc01920 2
MIOP00792 Minc01928 Minc01929 2
MIOP00793 Minc01933 Minc01934 2
MIOP00796 Minc01950 Minc01951 2
MIOP00799 Minc09434 Minc09435 2
MIOP00800 Minc09440 Minc09441 2
MIOP00802 Minc09457 Minc09458 2
MIOP00805 Minc09504 Minc09505 2
MIOP00806 Minc09507 Minc09508 2
MIOP00807 Minc09512 Minc09513 2
MIOP00809 Minc09525 Minc09526 2
MIOP00810 Minc09533 Minc09534 2
MIOP00811 Minc09540 Minc09541 2
MIOP00813 Minc09554 Minc09555 2
MIOP00814 Minc09563 Minc09564 2
MIOP00815 Minc09576 Minc09577 2

MIOP00817 Minc09593 Minc09594 2
MIOP00821 Minc09625 Minc09626 2
MIOP00822 Minc09628 Minc09629 2
MIOP00823 Minc09634 Minc09635 2
MIOP00824 Minc09637 Minc09638 2
MIOP00827 Minc09672 Minc09673 2
MIOP00828 Minc09674 Minc09675 2
MIOP00830 Minc09691 Minc09692 2
MIOP00831 Minc09724 Minc09725 2
MIOP00832 Minc09731 Minc09732 2
MIOP00833 Minc09733 Minc09734 2
MIOP00834 Minc09735 Minc09736 2
MIOP00835 Minc02015 Minc02016 2
MIOP00836 Minc02019 Minc02020 2
MIOP00837 Minc09761 Minc09762 2
MIOP00838 Minc09774 Minc09775 2
MIOP00840 Minc09784 Minc09785 2
MIOP00841 Minc09796 Minc09797 2
MIOP00842 Minc09801 Minc09802 2
MIOP00845 Minc09817 Minc09818 2
MIOP00846 Minc09821 Minc09822 2
MIOP00847 Minc09839 Minc09840 2
MIOP00848 Minc09844 Minc09845 2
MIOP00849 Minc09847 Minc09848 2
MIOP00851 Minc09863 Minc09864 2
MIOP00852 Minc09868 Minc09869 2
MIOP00854 Minc09876 Minc09877 2
MIOP00855 Minc09883 Minc09884 2
MIOP00856 Minc09898 Minc09899 2
MIOP00857 Minc09907 Minc09908 2
MIOP00859 Minc09941 Minc09942 2
MIOP00860 Minc09966 Minc09967 2
MIOP00861 Minc09976 Minc09977 2
MIOP00864 Minc10013 Minc10014 2

MIOP00865 Minc10026 Minc10027 2
MIOP00866 Minc10036 Minc10037 2
MIOP00868 Minc10058 Minc10059 2
MIOP00869 Minc02073 Minc02074 2
MIOP00870 Minc10064 Minc10065 2
MIOP00872 Minc10087 Minc10088 2
MIOP00873 Minc10102 Minc10103 2
MIOP00874 Minc10107 Minc10108 2
MIOP00877 Minc10161 Minc10162 2
MIOP00878 Minc10186 Minc10187 2
MIOP00879 Minc02124 Minc02125 2
MIOP00880 Minc02135 Minc02136 2
MIOP00885 Minc10233 Minc10234 2
MIOP00887 Minc10252 Minc10253 2
MIOP00890 Minc10273 Minc10274 2
MIOP00892 Minc10283 Minc10285 2
MIOP00893 Minc10289 Minc10290 2
MIOP00895 Minc10316 Minc10317 2
MIOP00896 Minc10325 Minc10326 2
MIOP00899 Minc02137 Minc02138 2
MIOP00900 Minc02155 Minc02156 2
MIOP00902 Minc02181 Minc02182 2
MIOP00903 Minc02187 Minc02188 2
MIOP00904 Minc02190 Minc02191 2
MIOP00905 Minc10345 Minc10346 2
MIOP00906 Minc10348 Minc10349 2
MIOP00908 Minc10360 Minc10361 2
MIOP00911 Minc10410 Minc10411 2
MIOP00913 Minc10436 Minc10437 2
MIOP00914 Minc10442 Minc10443 2
MIOP00915 Minc10448 Minc10449 2
MIOP00917 Minc10463 Minc10464 2
MIOP00918 Minc10466 Minc10467 2
MIOP00919 Minc10470 Minc10471 2

MIOP00921 Minc00335 Minc00336 2
MIOP00922 Minc00338 Minc00339 2
MIOP00923 Minc00340 Minc00341 2
MIOP00924 Minc00346 Minc00347 2
MIOP00925 Minc00351 Minc00352 2
MIOP00926 Minc00354 Minc00355 2
MIOP00927 Minc00367 Minc00368 2
MIOP00928 Minc02200 Minc02199 2
MIOP00929 Minc02211 Minc02212 2
MIOP00931 Minc02221 Minc02222 2
MIOP00936 Minc10512 Minc10513 2
MIOP00940 Minc10564 Minc10565 2
MIOP00941 Minc10571 Minc10572 2
MIOP00942 Minc10576 Minc10577 2
MIOP00943 Minc10614 Minc10615 2
MIOP00945 Minc02276 Minc02277 2
MIOP00946 Minc10630 Minc10631 2
MIOP00947 Minc10652 Minc10653 2
MIOP00949 Minc10664 Minc10665 2
MIOP00950 Minc10690 Minc10691 2
MIOP00952 Minc10699 Minc10700 2
MIOP00953 Minc10703 Minc10704 2
MIOP00955 Minc10722 Minc10723 2
MIOP00957 Minc02298 Minc02299 2
MIOP00958 Minc02301 Minc02302 2
MIOP00959 Minc02309 Minc02310 2
MIOP00961 Minc10745 Minc10746 2
MIOP00962 Minc10760 Minc10761 2
MIOP00963 Minc10765 Minc10766 2
MIOP00964 Minc10792 Minc10793 2
MIOP00965 Minc10802 Minc10803 2
MIOP00966 Minc10817 Minc10818 2
MIOP00967 Minc10820 Minc10821 2
MIOP00968 Minc10822 Minc10823 2

MIOP00970 Minc02343 Minc02344 2
MIOP00971 Minc02350 Minc02351 2
MIOP00972 Minc10848 Minc10849 2
MIOP00973 Minc10856 Minc10857 2
MIOP00976 Minc10889 Minc10890 2
MIOP00977 Minc10921 Minc10922 2
MIOP00978 Minc10932 Minc10933 2
MIOP00979 Minc10935 Minc10936 2
MIOP00980 Minc10937 Minc10938 2
MIOP00981 Minc10948 Minc10949 2
MIOP00982 Minc10954 Minc10955 2
MIOP00983 Minc10964 Minc10965 2
MIOP00984 Minc10966 Minc10967 2
MIOP00985 Minc10970 Minc10971 2
MIOP00986 Minc02366 Minc02367 2
MIOP00989 Minc02402 Minc02403 2
MIOP00990 Minc02404 Minc02405 2
MIOP00991 Minc10975 Minc10976 2
MIOP00992 Minc10979 Minc10980 2
MIOP00993 Minc10982 Minc10984 2
MIOP00995 Minc11011 Minc11012 2
MIOP00997 Minc11035 Minc11036 2
MIOP00998 Minc11043 Minc11044 2
MIOP00999 Minc11067 Minc11068 2
MIOP01001 Minc11084 Minc11085 2
MIOP01003 Minc02409 Minc02410 2
MIOP01004 Minc02412 Minc02413 2
MIOP01005 Minc11127 Minc11128 2
MIOP01006 Minc11131 Minc11132 2
MIOP01007 Minc11152 Minc11153 2
MIOP01009 Minc11162 Minc11163 2
MIOP01012 Minc11176 Minc11177 2
MIOP01013 Minc11198 Minc11199 2
MIOP01014 Minc11230 Minc11231 2

MIOP01015 Minc11235 Minc11236 2
MIOP01016 Minc11254 Minc11255 2
MIOP01017 Minc11261 Minc11262 2
MIOP01018 Minc11277 Minc11278 2
MIOP01020 Minc11286 Minc11287 2
MIOP01021 Minc11292 Minc11293 2
MIOP01025 Minc11349 Minc11350 2
MIOP01027 Minc11379 Minc11380 2
MIOP01028 Minc11391 Minc11392 2
MIOP01029 Minc11412 Minc11413 2
MIOP01031 Minc11434 Minc11435 2
MIOP01032 Minc11439 Minc11440 2
MIOP01033 Minc11443 Minc11444 2
MIOP01034 Minc11449 Minc11450 2
MIOP01037 Minc11467 Minc11468 2
MIOP01041 Minc11534 Minc11535 2
MIOP01042 Minc11544 Minc11545 2
MIOP01043 Minc11547 Minc11548 2
MIOP01046 Minc11579 Minc11580 2
MIOP01047 Minc11584 Minc11585 2
MIOP01049 Minc02588 Minc02589 2
MIOP01050 Minc11600 Minc11601 2
MIOP01053 Minc11597 Minc11598 2
MIOP01055 Minc11631 Minc11632 2
MIOP01057 Minc11647 Minc11648 2
MIOP01058 Minc11655 Minc11656 2
MIOP01059 Minc11678 Minc11679 2
MIOP01060 Minc00385 Minc00386 2
MIOP01062 Minc00411 Minc00412 2
MIOP01063 Minc00420 Minc00421 2
MIOP01065 Minc00428 Minc00429 2
MIOP01067 Minc11714 Minc11715 2
MIOP01068 Minc11736 Minc11737 2
MIOP01071 Minc11782 Minc11783 2

MIOP01072 Minc11793 Minc11794 2
MIOP01073 Minc02639 Minc02640 2
MIOP01074 Minc02642 Minc02643 2
MIOP01076 Minc02655 Minc02656 2
MIOP01078 Minc02669 Minc02670 2
MIOP01079 Minc11801 Minc11802 2
MIOP01080 Minc11804 Minc11805 2
MIOP01082 Minc11815 Minc11816 2
MIOP01083 Minc11834 Minc11835 2
MIOP01084 Minc11841 Minc11842 2
MIOP01085 Minc11844 Minc11845 2
MIOP01086 Minc11861 Minc11862 2
MIOP01087 Minc11879 Minc11880 2
MIOP01088 Minc11888 Minc11889 2
MIOP01089 Minc02673 Minc02674 2
MIOP01090 Minc02675 Minc02676 2
MIOP01091 Minc02683 Minc02684 2
MIOP01093 Minc11929 Minc11930 2
MIOP01095 Minc11939 Minc11940 2
MIOP01097 Minc11972 Minc11973 2
MIOP01098 Minc11983 Minc11984 2
MIOP01102 Minc02727 Minc02728 2
MIOP01103 Minc02729 Minc02730 2
MIOP01104 Minc02732 Minc02733 2
MIOP01106 Minc02743 Minc02744 2
MIOP01108 Minc02753 Minc02754 2
MIOP01109 Minc02756 Minc02757 2
MIOP01110 Minc12041 Minc12042 2
MIOP01111 Minc12044 Minc12045 2
MIOP01112 Minc12047 Minc12048 2
MIOP01113 Minc12049 Minc12050 2
MIOP01114 Minc12060 Minc12061 2
MIOP01115 Minc12079 Minc12080 2
MIOP01118 Minc12113 Minc12114 2

MIOP01119 Minc02771 Minc02772 2
MIOP01120 Minc02801 Minc02802 2
MIOP01121 Minc12133 Minc12134 2
MIOP01122 Minc12165 Minc12166 2
MIOP01123 Minc12177 Minc12178 2
MIOP01125 Minc12196 Minc12197 2
MIOP01126 Minc12202 Minc12203 2
MIOP01127 Minc12208 Minc12209 2
MIOP01130 Minc02813 Minc02814 2
MIOP01131 Minc02822 Minc02824 2
MIOP01133 Minc12276 Minc12277 2
MIOP01134 Minc12278 Minc12279 2
MIOP01135 Minc12284 Minc12285 2
MIOP01136 Minc12294 Minc12295 2
MIOP01138 Minc12306 Minc12307 2
MIOP01140 Minc12342 Minc12343 2
MIOP01141 Minc12353 Minc12354 2
MIOP01142 Minc02859 Minc02860 2
MIOP01143 Minc02854 Minc02855 2
MIOP01144 Minc02856 Minc02857 2
MIOP01145 Minc02869 Minc02870 2
MIOP01147 Minc02882 Minc02883 2
MIOP01148 Minc12356 Minc12357 2
MIOP01149 Minc12361 Minc12362 2
MIOP01150 Minc12372 Minc12373 2
MIOP01151 Minc12390 Minc12391 2
MIOP01154 Minc12428 Minc12429 2
MIOP01155 Minc12432 Minc12433 2
MIOP01156 Minc12434 Minc12435 2
MIOP01157 Minc12444 Minc12445 2
MIOP01159 Minc02927 Minc02928 2
MIOP01162 Minc02907 Minc02908 2
MIOP01163 Minc02915 Minc02916 2
MIOP01164 Minc12473 Minc12474 2

MIOP01165 Minc12481 Minc12482 2
MIOP01166 Minc12497 Minc12498 2
MIOP01167 Minc12528 Minc12529 2
MIOP01168 Minc12539 Minc12540 2
MIOP01169 Minc02945 Minc02946 2
MIOP01170 Minc02951 Minc02952 2
MIOP01171 Minc02954 Minc02955 2
MIOP01172 Minc02956 Minc02957 2
MIOP01173 Minc02959 Minc02960 2
MIOP01176 Minc12553 Minc12554 2
MIOP01177 Minc12564 Minc12565 2
MIOP01178 Minc12560 Minc12561 2
MIOP01180 Minc12584 Minc12585 2
MIOP01181 Minc12598 Minc12599 2
MIOP01182 Minc12612 Minc12613 2
MIOP01183 Minc12625 Minc12626 2
MIOP01184 Minc12630 Minc12631 2
MIOP01185 Minc12634 Minc12635 2
MIOP01186 Minc12637 Minc12638 2
MIOP01193 Minc03006 Minc03007 2
MIOP01194 Minc12663 Minc12664 2
MIOP01195 Minc12672 Minc12673 2
MIOP01197 Minc12715 Minc12716 2
MIOP01198 Minc12724 Minc12725 2
MIOP01200 Minc12739 Minc12740 2
MIOP01202 Minc00450 Minc00451 2
MIOP01203 Minc00453 Minc00454 2
MIOP01204 Minc00459 Minc00460 2
MIOP01205 Minc00465 Minc00466 2
MIOP01206 Minc00499 Minc00500 2
MIOP01208 Minc03039 Minc03040 2
MIOP01209 Minc03041 Minc03042 2
MIOP01210 Minc03052 Minc03053 2
MIOP01211 Minc03057 Minc03058 2

MIOP01212 Minc03068 Minc03069 2
MIOP01214 Minc12756 Minc12757 2
MIOP01215 Minc12767 Minc12768 2
MIOP01219 Minc12795 Minc12796 2
MIOP01220 Minc12798 Minc12799 2
MIOP01221 Minc12811 Minc12812 2
MIOP01222 Minc12825 Minc12826 2
MIOP01223 Minc12833 Minc12834 2
MIOP01224 Minc12842 Minc12843 2
MIOP01225 Minc03092 Minc03093 2
MIOP01226 Minc12867 Minc12868 2
MIOP01227 Minc12875 Minc12876 2
MIOP01229 Minc12886 Minc12887 2
MIOP01230 Minc12891 Minc12892 2
MIOP01232 Minc12928 Minc12929 2
MIOP01234 Minc12937 Minc12938 2
MIOP01235 Minc03125 Minc03126 2
MIOP01236 Minc03130 Minc03131 2
MIOP01237 Minc03136 Minc03137 2
MIOP01238 Minc03145 Minc03146 2
MIOP01240 Minc12982 Minc12983 2
MIOP01241 Minc12996 Minc12997 2
MIOP01242 Minc13000 Minc13001 2
MIOP01243 Minc13005 Minc13006 2
MIOP01244 Minc13017 Minc13018 2
MIOP01246 Minc13021 Minc13020 2
MIOP01247 Minc13023 Minc13024 2
MIOP01248 Minc13025 Minc13026 2
MIOP01254 Minc03179 Minc03180 2
MIOP01255 Minc03181 Minc03182 2
MIOP01257 Minc13123 Minc13124 2
MIOP01258 Minc13140 Minc13141 2
MIOP01259 Minc13152 Minc13153 2
MIOP01260 Minc13167 Minc13166 2

MIOP01261 Minc13189 Minc13190 2
MIOP01262 Minc03236 Minc03237 2
MIOP01264 Minc13209 Minc13211 2
MIOP01265 Minc13225 Minc13226 2
MIOP01266 Minc13246 Minc13247 2
MIOP01268 Minc13253 Minc13254 2
MIOP01269 Minc13270 Minc13271 2
MIOP01270 Minc13276 Minc13277 2
MIOP01271 Minc03238 Minc03239 2
MIOP01272 Minc03248 Minc03249 2
MIOP01274 Minc03275 Minc03276 2
MIOP01276 Minc13302 Minc13301 2
MIOP01277 Minc13330 Minc13331 2
MIOP01278 Minc13334 Minc13335 2
MIOP01279 Minc13336 Minc13337 2
MIOP01280 Minc13344 Minc13345 2
MIOP01282 Minc13365 Minc13366 2
MIOP01283 Minc13368 Minc13370 2
MIOP01285 Minc13391 Minc13392 2
MIOP01286 Minc13422 Minc13423 2
MIOP01287 Minc13426 Minc13427 2
MIOP01289 Minc03319 Minc03320 2
MIOP01290 Minc03323 Minc03324 2
MIOP01292 Minc03335 Minc03336 2
MIOP01296 Minc03366 Minc03367 2
MIOP01298 Minc03375 Minc03376 2
MIOP01299 Minc03385 Minc03386 2
MIOP01303 Minc13574 Minc13575 2
MIOP01304 Minc13588 Minc13589 2
MIOP01305 Minc00518 Minc00519 2
MIOP01307 Minc00527 Minc00528 2
MIOP01308 Minc00530 Minc00531 2
MIOP01309 Minc00535 Minc00536 2
MIOP01310 Minc00549 Minc00550 2

MIOP01311 Minc00551 Minc00552 2
MIOP01312 Minc00553 Minc00555 2
MIOP01315 Minc00584 Minc00585 2
MIOP01316 Minc00589 Minc00590 2
MIOP01317 Minc03397 Minc03398 2
MIOP01319 Minc03413 Minc03414 2
MIOP01321 Minc13641 Minc13642 2
MIOP01322 Minc13658 Minc13659 2
MIOP01323 Minc13668 Minc13669 2
MIOP01324 Minc13675 Minc13676 2
MIOP01327 Minc03438 Minc03439 2
MIOP01328 Minc13702 Minc13703 2
MIOP01329 Minc13711 Minc13712 2
MIOP01331 Minc13717 Minc13718 2
MIOP01332 Minc13734 Minc13735 2
MIOP01334 Minc13751 Minc13752 2
MIOP01336 Minc03473 Minc03474 2
MIOP01339 Minc13786 Minc13787 2
MIOP01340 Minc13805 Minc13806 2
MIOP01341 Minc13808 Minc13809 2
MIOP01342 Minc13812 Minc13813 2
MIOP01343 Minc13815 Minc13816 2
MIOP01344 Minc13827 Minc13826 2
MIOP01345 Minc13821 Minc13822 2
MIOP01346 Minc13830 Minc13831 2
MIOP01347 Minc03491 Minc03492 2
MIOP01348 Minc03497 Minc03498 2
MIOP01349 Minc13843 Minc13844 2
MIOP01350 Minc13851 Minc13852 2
MIOP01352 Minc13874 Minc13875 2
MIOP01353 Minc13877 Minc13878 2
MIOP01355 Minc13896 Minc13897 2
MIOP01357 Minc03533 Minc03534 2
MIOP01358 Minc03543 Minc03544 2

MIOP01359 Minc03548 Minc03549 2
MIOP01360 Minc03553 Minc03554 2
MIOP01361 Minc03557 Minc03558 2
MIOP01362 Minc13939 Minc13940 2
MIOP01363 Minc13946 Minc13947 2
MIOP01364 Minc13949 Minc13950 2
MIOP01365 Minc13957 Minc13958 2
MIOP01368 Minc13986 Minc13987 2
MIOP01369 Minc03559 Minc03560 2
MIOP01371 Minc03575 Minc03576 2
MIOP01372 Minc03581 Minc03582 2
MIOP01373 Minc03591 Minc03592 2
MIOP01377 Minc14020 Minc14021 2
MIOP01380 Minc14057 Minc14058 2
MIOP01381 Minc14061 Minc14062 2
MIOP01382 Minc14065 Minc14066 2
MIOP01383 Minc03620 Minc03621 2
MIOP01384 Minc03629 Minc03630 2
MIOP01387 Minc03648 Minc03649 2
MIOP01389 Minc14098 Minc14099 2
MIOP01390 Minc14105 Minc14106 2
MIOP01393 Minc14128 Minc14129 2
MIOP01394 Minc14134 Minc14135 2
MIOP01395 Minc03656 Minc03657 2
MIOP01396 Minc03662 Minc03663 2
MIOP01397 Minc14152 Minc14153 2
MIOP01398 Minc14162 Minc14163 2
MIOP01402 Minc03676 Minc03677 2
MIOP01405 Minc03689 Minc03690 2
MIOP01406 Minc14213 Minc14214 2
MIOP01409 Minc14238 Minc14239 2
MIOP01410 Minc14249 Minc14250 2
MIOP01411 Minc03691 Minc03692 2
MIOP01412 Minc03698 Minc03699 2

MIOP01413 Minc03706 Minc03707 2
MIOP01414 Minc14268 Minc14269 2
MIOP01415 Minc14291 Minc14292 2
MIOP01416 Minc14293 Minc14294 2
MIOP01417 Minc14309 Minc14310 2
MIOP01420 Minc00591 Minc00592 2
MIOP01422 Minc00620 Minc00621 2
MIOP01423 Minc00629 Minc00630 2
MIOP01424 Minc00638 Minc00639 2
MIOP01425 Minc00644 Minc00645 2
MIOP01426 Minc00646 Minc00648 2
MIOP01427 Minc03727 Minc03728 2
MIOP01428 Minc14366 Minc14367 2
MIOP01430 Minc14385 Minc14386 2
MIOP01432 Minc14408 Minc14409 2
MIOP01433 Minc03769 Minc03770 2
MIOP01434 Minc03754 Minc03755 2
MIOP01435 Minc03764 Minc03765 2
MIOP01436 Minc14426 Minc14427 2
MIOP01438 Minc14457 Minc14459 2
MIOP01439 Minc14462 Minc14463 2
MIOP01440 Minc14469 Minc14470 2
MIOP01443 Minc14491 Minc14492 2
MIOP01444 Minc03811 Minc03812 2
MIOP01448 Minc14536 Minc14537 2
MIOP01449 Minc14549 Minc14550 2
MIOP01451 Minc03818 Minc03819 2
MIOP01453 Minc03832 Minc03833 2
MIOP01456 Minc14569 Minc14570 2
MIOP01457 Minc14588 Minc14589 2
MIOP01459 Minc03877 Minc03878 2
MIOP01460 Minc03881 Minc03882 2
MIOP01461 Minc14622 Minc14623 2
MIOP01462 Minc14636 Minc14637 2

MIOP01464 Minc14665 Minc14666 2
MIOP01465 Minc14670 Minc14671 2
MIOP01466 Minc14678 Minc14679 2
MIOP01468 Minc14700 Minc14701 2
MIOP01469 Minc14709 Minc14710 2
MIOP01470 Minc14719 Minc14720 2
MIOP01471 Minc14724 Minc14726 2
MIOP01472 Minc03916 Minc03917 2
MIOP01473 Minc03919 Minc03920 2
MIOP01474 Minc03925 Minc03926 2
MIOP01475 Minc14745 Minc14746 2
MIOP01476 Minc14748 Minc14749 2
MIOP01477 Minc14751 Minc14752 2
MIOP01479 Minc14764 Minc14765 2
MIOP01481 Minc14779 Minc14780 2
MIOP01483 Minc14788 Minc14789 2
MIOP01485 Minc03944 Minc03945 2
MIOP01486 Minc03957 Minc03958 2
MIOP01487 Minc03971 Minc03972 2
MIOP01488 Minc14820 Minc14821 2
MIOP01489 Minc14823 Minc14824 2
MIOP01490 Minc14827 Minc14828 2
MIOP01491 Minc14842 Minc14843 2
MIOP01492 Minc14847 Minc14848 2
MIOP01493 Minc14861 Minc14863 2
MIOP01494 Minc14867 Minc14868 2
MIOP01495 Minc14878 Minc14879 2
MIOP01496 Minc14883 Minc14884 2
MIOP01497 Minc14885 Minc14886 2
MIOP01501 Minc14898 Minc14899 2
MIOP01502 Minc14906 Minc14907 2
MIOP01503 Minc14909 Minc14910 2
MIOP01504 Minc14913 Minc14914 2
MIOP01505 Minc14923 Minc14924 2

MIOP01506 Minc14931 Minc14933 2
MIOP01507 Minc14935 Minc14936 2
MIOP01509 Minc14949 Minc14950 2
MIOP01510 Minc14959 Minc14960 2
MIOP01511 Minc14961 Minc14962 2
MIOP01514 Minc00668 Minc00669 2
MIOP01515 Minc00690 Minc00691 2
MIOP01516 Minc00696 Minc00697 2
MIOP01518 Minc04044 Minc04045 2
MIOP01522 Minc15037 Minc15038 2
MIOP01523 Minc15054 Minc15055 2
MIOP01524 Minc15065 Minc15066 2
MIOP01525 Minc04074 Minc04075 2
MIOP01526 Minc04083 Minc04084 2
MIOP01528 Minc15114 Minc15115 2
MIOP01530 Minc15121 Minc15122 2
MIOP01531 Minc15123 Minc15124 2
MIOP01533 Minc04113 Minc04114 2
MIOP01535 Minc04121 Minc04122 2
MIOP01536 Minc04131 Minc04132 2
MIOP01537 Minc15135 Minc15136 2
MIOP01538 Minc15141 Minc15142 2
MIOP01540 Minc15172 Minc15173 2
MIOP01541 Minc15191 Minc15192 2
MIOP01542 Minc15203 Minc15204 2
MIOP01543 Minc15209 Minc15210 2
MIOP01544 Minc15220 Minc15221 2
MIOP01545 Minc15227 Minc15228 2
MIOP01546 Minc15243 Minc15244 2
MIOP01547 Minc04162 Minc04163 2
MIOP01548 Minc04181 Minc04182 2
MIOP01549 Minc04188 Minc04189 2
MIOP01553 Minc15262 Minc15263 2
MIOP01554 Minc15284 Minc15285 2

MIOP01556 Minc15295 Minc15296 2
MIOP01557 Minc04240 Minc04241 2
MIOP01559 Minc15330 Minc15331 2
MIOP01560 Minc15332 Minc15333 2
MIOP01562 Minc15347 Minc15348 2
MIOP01563 Minc15355 Minc15356 2
MIOP01564 Minc15358 Minc15359 2
MIOP01565 Minc15380 Minc15381 2
MIOP01567 Minc15389 Minc15390 2
MIOP01568 Minc15393 Minc15394 2
MIOP01570 Minc04267 Minc04268 2
MIOP01571 Minc04287 Minc04288 2
MIOP01572 Minc15459 Minc15460 2
MIOP01573 Minc15467 Minc15468 2
MIOP01574 Minc15475 Minc15476 2
MIOP01575 Minc15486 Minc15487 2
MIOP01576 Minc04303 Minc04304 2
MIOP01580 Minc04327 Minc04328 2
MIOP01581 Minc04336 Minc04337 2
MIOP01582 Minc04341 Minc04342 2
MIOP01583 Minc15560 Minc15561 2
MIOP01584 Minc15558 Minc15559 2
MIOP01585 Minc15584 Minc15585 2

Table S5 | Conservation of operons between B. malayi, C. elegans and M. incognita

| Number of: | B. malayi | C. elegans |
|---|---|---|
| M. incognita genes with orthologs | 14,503 | 8,098 |
| genes with orthologs in M. incognita operons | 3,083 (21%) | 1,879 (23%) |
| M. incognita operonic genes with orthologs also in operons | 336 | 516 |
| operons conserved with M. incognita | 3 | 9 |

4.3 Summary

Operons in nematodes are a dynamic feature of genome architecture. The few instances of absolute conservation of operons between species, and the apparent lability of operon membership of individual genes, suggest that whatever process moves genes into and out of operons is not driven by core biochemical requirements, such as the need for transcription at high levels, which are common to all of the species studied. There is no clear functional linkage between genes in most operons (though there are exceptions to this general rule¹⁹,²⁹).

5 Protein coding gene set

A total of 20,359 gene models were detected in the M. incognita genome (Supplementary Methods, section 8.8), corresponding to 19,212 distinct loci; the difference is attributed to splice variants.

5.1 Supplementary results

Table S6 | M. incognita general features and comparative genome content.

| Features | M. incognita | B. malayi²⁸ | C. elegans²⁸ |
|---|---|---|---|
| Overall | | | |
| Estimated size of genome (Mb) | 47-51* | 90-95* | 100* |
| Total size of assembled sequence (Mb) | 86 | 88 | 100 |
| Number of scaffolds / chromosomes | 2,817 | 8,180 | 6 chr. |
| N50 of scaffolds (kb) | 82.8 | 93.8 | Entirely sequenced |
| Maximum length of scaffold (bp) | 593,295 | 6,534,162 | |
| Number of bp assembled into scaffolds | 86,079,672 | 70,837,048 | |
| Number of orphan contigs | / | 18,868 | |
| Number of bp assembled into orphan contigs | / | 17,526,009 | |
| Number of singletons (Mi: unplaced reads) | 267,656 | 176,099 | |
| Number of bp in singletons | 164,646,515 | 108,289,205 | |
| Protein-coding regions | | | |
| Percent of genome containing protein-coding sequence (%) | 25.32 | 17.84 | 25.55ᶜ |
| Number of gene models | 19,212 | 11,515ᵃ | 20,072ᵇ |
| Number of proteins | 20,365 | 11,508ᵃ | 23,541ᵇ |
| Max/average protein length (amino acids) | 5,970/354 | 9,445/371 | 18,563/440 |
| Gene density (genes per Mb) | 223 | 162 | 228 |
| Number of exons | 127,119 | 83,672 | 146,294ᵇ |
| Mean/median exon size (bp) | 169/136 | 159/140 | 307/147 |
| Mean/median number of exons per gene | 6.62/5 | 7.27/5 | 6.38/6 |
| Number of bp included in exons | 21,776,820 | 13,282,846 | 31,498,196ᵇ |
| Number of introns | 109,181 | 72,157 | 138,596ᵇ |
| Mean/median intron size (bp) | 230/82 | 311/219 | 320/68 |
| Number of bp included in introns | 25,081,026 | 22,512,502 | 35,536,104ᵇ |
| Mean length of intergenic region (bp) | 1,402 | 3,783 | 2,218 |
| Overall G + C content (%) | 31.4 | 30.5 | 35.4 |
| Exons, G + C content (%) | 35.5 | 39.6 | 42.9 |
| Introns, G + C content (%) | 29.2 | 27.6 | 29.1 |
| Intergenic regions, G + C content (%) | 31.4 | 30.9 | 32.5 |
| Non-protein coding genes | | | |
| Transfer RNA (tRNA) genes (+ tRNA pseudogenes) | 467 (120) | ~233 (26) | 608 (213)ᶜ |
| 5S ribosomal RNA (found in scaffolds and orphan contigs) | ~29 | ~400 | ~100ᶜ |

ᵃ Number of genes includes seven predicted pseudogenes. Splice variants were not taken into account. Between 14,500 and 17,800 genes have been estimated after inclusion of genes potentially present in the unannotated portion of the genome²⁸.

ᵇ According to Wormpep version 183.

ᶜ According to Spieth et al. 2006³⁰.

* M. incognita: flow cytometry¹²; B. malayi: flow cytometry and clone-based²⁸; the C. elegans genome has been completely sequenced telomere to telomere (no gaps) and is exactly 100,291,840 bp³¹.
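Several of the derived figures in Table S6 can be approximated from the raw counts in the same column. The short sketch below does this for M. incognita. It assumes that the number of bp assembled into scaffolds is the denominator for gene density and coding fraction, which the table does not state explicitly; the function name and dictionary keys are illustrative only.

```python
# Values copied from the M. incognita column of Table S6.
ASSEMBLED_BP = 86_079_672   # bp assembled into scaffolds (assumed denominator)
GENE_MODELS = 19_212
EXONS = 127_119
EXON_BP = 21_776_820


def derived_stats(assembled_bp, gene_models, exons, exon_bp):
    """Return gene density, exons per gene and exonic percentage."""
    megabases = assembled_bp / 1_000_000
    return {
        "genes_per_mb": gene_models / megabases,
        "exons_per_gene": exons / gene_models,
        "percent_exonic": 100 * exon_bp / assembled_bp,
    }


stats = derived_stats(ASSEMBLED_BP, GENE_MODELS, EXONS, EXON_BP)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under this assumption the gene density and the mean number of exons per gene round to the tabulated 223 and 6.62. The exonic percentage comes out close to, but not exactly at, the tabulated 25.32%, which suggests that a slightly different denominator or definition was used for that row.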

5.2 Similarity pattern between predicted proteins

Because of the presence of two highly divergent haplotypes with aneuploidy, we analyzed the percentage of identity between predicted proteins. To do so, we ran the CD-hit program³² on all the M. incognita predicted protein sequences to identify clusters of proteins that were at least 95% identical over their entire length (without internal gaps). Proteins of different lengths were grouped in the same cluster if the shorter one aligned over its entire length with the longer one with at least 95% identity. Results are presented in Table S7. A total of 14,121 proteins were less than 95% identical to any other protein. CD-hit grouped 4,338 proteins into 2,169 clusters of two sequences that were at least 95% identical. The largest cluster consisted of 17 proteins that were at least 95% identical with one another. Using a threshold of 95% identity, a total of 16,832 proteins can therefore be considered as different.
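The tallying of cluster sizes described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the usual CD-HIT `.clstr` layout (a `>Cluster N` header line followed by one line per member sequence), and the protein names in the toy input are hypothetical placeholders rather than real Minc identifiers.

```python
from collections import Counter

# Toy .clstr-style text; protein names are hypothetical placeholders.
EXAMPLE_CLSTR = (
    ">Cluster 0\n"
    "0\t354aa, >protA... *\n"
    "1\t350aa, >protB... at 97.10%\n"
    ">Cluster 1\n"
    "0\t120aa, >protC... *\n"
    ">Cluster 2\n"
    "0\t800aa, >protD... *\n"
    "1\t795aa, >protE... at 99.00%\n"
    "2\t640aa, >protF... at 95.50%\n"
)


def cluster_sizes(clstr_text):
    """Return the number of member sequences in each cluster, in file order."""
    sizes = []
    for line in clstr_text.splitlines():
        if line.startswith(">Cluster"):
            sizes.append(0)        # a header line opens a new cluster
        elif line.strip() and sizes:
            sizes[-1] += 1         # every other non-empty line is one member
    return sizes


def size_distribution(sizes):
    """Map cluster size -> (number of clusters, number of proteins)."""
    counts = Counter(sizes)
    return {size: (n, size * n) for size, n in sorted(counts.items())}


sizes = cluster_sizes(EXAMPLE_CLSTR)
for size, (n_clusters, n_proteins) in size_distribution(sizes).items():
    print(size, n_clusters, n_proteins)
```

Each row of the resulting distribution corresponds to one line of a table such as Table S7: the number of clusters of a given size and the number of proteins they contain.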

Table S7 | CD-HIT clustering of M. incognita proteins at 95% identity threshold.

| Cluster size | Number of proteins | Number of clusters |
|---|---|---|
| 1 | 14,121 | 14,121 |
| 2 | 4,338 | 2,169 |
| 3 | 1,134 | 378 |
| 4 | 440 | 110 |
| 5 - 10 | 309 | 53 |
| 11 - 15 | 0 | 0 |
| 16 - 20 | 17 | 1 |
| > 20 | 0 | 0 |
| Total | 20,359 | 16,832 |
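The totals in Table S7 can be checked directly from its rows. The sketch below, with illustrative function names, sums both columns and verifies that, for the single-size classes, the number of proteins equals the cluster size multiplied by the number of clusters.

```python
# Rows of Table S7: (cluster size class, number of proteins, number of clusters).
TABLE_S7 = [
    ("1", 14_121, 14_121),
    ("2", 4_338, 2_169),
    ("3", 1_134, 378),
    ("4", 440, 110),
    ("5-10", 309, 53),
    ("11-15", 0, 0),
    ("16-20", 17, 1),
    (">20", 0, 0),
]


def totals(rows):
    """Return (total proteins, total clusters) over all size classes."""
    return sum(r[1] for r in rows), sum(r[2] for r in rows)


def fixed_size_rows_consistent(rows):
    """For single-size classes, proteins must equal size * clusters."""
    return all(
        proteins == int(label) * clusters
        for label, proteins, clusters in rows
        if label.isdigit()
    )


total_proteins, total_clusters = totals(TABLE_S7)
print(total_proteins, total_clusters)
print(fixed_size_rows_consistent(TABLE_S7))
```

The two sums reproduce the 20,359 proteins and 16,832 clusters given in the Total row.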

To further investigate the highly divergent allelic copies detected during the genome assembly (Supplementary Data, section 2.2), we calculated the percentage of identity between M. incognita proteins identified as belonging to an allelic pair (Fig. S10). We observed a wide range of divergence. It is therefore possible that many of these allelic gene pairs have functionally diverged. For this reason, we considered the 19,212 genes as the constituents of the M. incognita predicted proteome, although a large proportion of them were allelic (diverged) pairs.

Figure S10 | Percent identity for allelic predicted proteins (3,882 pairs). The allelic versions of each protein were detected on the basis of their position in allelic supercontigs. The x axis shows the percent identity and the y axis the fraction of pairs belonging to each identity category.

6 Automatic functional annotation

Gene functions were automatically assigned to 54.7% of the predicted proteins. This assignment was based on the identification of InterPro³³ (IPR) domains (release 16.1) using InterProScan³⁴. To provide a non-redundant description of protein function, when several overlapping domains were found for a given protein, only the last common ancestral term in the InterPro hierarchy was kept (Figure S11 and Supplementary Methods, section 8.9). Approximately 50% of M. incognita predicted proteins fell into OrthoMCL clusters with at least one other species. The remainder comprised either OrthoMCL clusters containing only M. incognita in-paralogs or M. incognita proteins that could not be assigned to any cluster (Supplementary Methods, section 8.9). A summary of the results of the InterPro and OrthoMCL analyses, in which all splice variants were considered, is available in Table S8 and Figure S12. An overview of the M. incognita protein set, with splice variants removed from the analysis, is provided in Table S10.
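One way to read the rule "only the last common ancestral term in the InterPro hierarchy was kept" is as a lowest-common-ancestor lookup in a parent/child tree. The sketch below illustrates that reading only; the actual procedure is described in Supplementary Methods, section 8.9. The hierarchy, the term names and the function names are hypothetical placeholders, not real InterPro entries or the authors' code.

```python
# Hypothetical child -> parent map; placeholder names, not real InterPro entries.
PARENT = {
    "IPR_kinase_ser_thr": "IPR_kinase",
    "IPR_kinase_tyr": "IPR_kinase",
    "IPR_kinase": "IPR_kinase_like",
    "IPR_kinase_like": None,   # root of this toy tree
    "IPR_zinc_finger": None,   # root of a separate tree
}


def lineage(term, parent=PARENT):
    """Return the path from a term up to its root, starting with the term."""
    path = []
    while term is not None:
        path.append(term)
        term = parent[term]
    return path


def last_common_ancestor(terms, parent=PARENT):
    """Return the deepest term shared by all lineages, or None if there is none."""
    terms = list(terms)
    if not terms:
        return None
    shared = set(lineage(terms[0], parent))
    for term in terms[1:]:
        shared &= set(lineage(term, parent))
    # Walking up one lineage, the first shared term is the deepest one.
    for candidate in lineage(terms[0], parent):
        if candidate in shared:
            return candidate
    return None


print(last_common_ancestor(["IPR_kinase_ser_thr", "IPR_kinase_tyr"]))
```

In this toy tree, two sibling kinase terms collapse to their shared parent, while terms from unrelated trees have no common ancestor and would have to be reported separately.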

Figure S11 | Distribution of gene ontology terms assigned to M. incognita genes.

A total of 41% of the 20,359 predicted M. incognita genes were assigned a gene ontology (GO) term, compared with 49% of the 23,541 C. elegans genes, 47% of the 11,515 B. malayi genes and 60% of the 20,822 D. melanogaster genes. A predicted gene can have more than one GO term. The y axis indicates the percentage of each term relative to the total number of terms found in the species.

Table S8 | Comparative analysis of the number of proteins found with InterPro domains, signal peptides and the number of proteins which share an orthoMCL group

| Species | Nb of proteins | with IPR | in an orthoMCL group | with signalp |
|---|---|---|---|---|
| M. incognita | 20,359 | 11,149 (54.7%) | 16,754 (82.3%) | 4,438 |
| C. elegans | 23,541 | 16,575 (70.4%) | 20,133 (85.5%) | 7,921 |
| C. briggsae | 19,511 | 12,886 (66.0%) | 17,066 (87.5%) | 5,869 |
| B. malayi | 11,515 | 7,174 (62.3%) | 8,070 (70.0%) | 2,002 |
| D. melanogaster | 20,822 | 16,009 (76.9%) | 17,008 (81.7%) | 1,628 |
| M. grisea | 11,109 | 6,790 (61.1%) | 8,278 (74.5%) | 2,532 |
| G. zea | 9,820 | 6,696 (68.2%) | 7,388 (75.2%) | 2,066 |
| N. crassa | 10,079 | 5,753 (57.1%) | 7,049 (69.9%) | 1,727 |

Table S9 | Interpro (IPR) domain abundance in M. incognita genome compared

to the C.elegans, B. malayi and D. melanogaster genomes.

IPR M. inc

prot nb / rk C. ele

prot nb / rk B. mal

prot nb / rk D. mel

prot nb / rk IPR_description

The 10 most represented InterPro domains in M. incognita

IPR011009 416 / 1 683 / 1 297 / 1 651 / 1 Protein kinase-like

IPR013083 205 / 2 223 / 14 118 / 6 251 / 12 Zinc finger, RING/FYVE/PHD-type

IPR000276 202 / 3 497 / 3 58 / 23 137 / 35 Rhodopsin-like GPCR superfamily

IPR015880 186 / 4 306 / 9 261 / 2 529 / 2 Zinc finger, C2H2-like

IPR000504 185 / 5 163 / 25 116 / 7 345 / 4 RNA recognition motif, RNP-1

IPR012677 183 / 6 158 / 28 116 / 7 329 / 5 Nucleotide-binding, alpha-beta plait

IPR015943 181 / 7 219 / 15 170 / 3 312 / 6 WD40/YVTN repeat-like

IPR009057 181 / 7 195 / 19 101 / 8 289 / 10 Homeodomain-like

IPR001841 181 / 7 189 / 21 86 / 11 208 / 21 Zinc finger, RING-type

IPR008160 169 / 8 198 / 18 93 / 9 14 / 112 Collagen triple helix repeat

10 IPR domains with an increased abundance in M. incognita compared to C. elegans

IPR008906 66 / 40 10 / 117 2 / 71 1 / 125 HAT dimerisation

IPR012816 47 / 52 6 / 121 3 / 70 0 / 126 Conserved hypothetical protein CHP02464

IPR011050 41 / 58 4 / 123 3 / 70 3 / 123 Pectin lyase fold/virulence factor

IPR001547 35 / 63 1 / 126 4 / 69 9 / 117 Glycoside hydrolase, family 5

IPR011068 33 / 65 3 / 124 1 / 72 3 / 123 Nucleotidyltransferase, class I, C-terminal-like

IPR001503 32 / 66 5 / 122 2 / 71 4 / 122 Glycosyl transferase, family 10

IPR003653 26 / 72 5 / 122 4 / 69 15 / 111 Peptidase C48, SUMO/Sentrin/Ubl1

IPR008269 13 / 85 2 / 125 1 / 72 2 / 124 Peptidase S16, lon C-terminal

IPR002872 5 / 93 1 / 126 1 / 72 8 / 118 Proline dehydrogenase

IPR013333 5 / 93 1 / 126 0 / 73 4 / 122 Ryanodine receptor, N-terminal

10 IPR domains with a decreased abundance in M. incognita compared to C. elegans

IPR003002 26 / 72 641 / 2 0 / 73 0 / 126 7TM chemoreceptor, subfamily 1

IPR000324 20 / 78 140 / 31 17 / 56 31 / 95 Vitamin D receptor

IPR002656 6 / 92 65 / 65 1 / 72 18 / 108 Acyltransferase 3

IPR000494 2 / 96 62 / 68 2 / 71 7 / 119 EGF receptor, L domain

IPR004151 11 / 87 59 / 71 0 / 73 0 / 126 C. elegans Sre G protein-coupled

chemoreceptor

IPR000609 2 / 96 47 / 80 0 / 73 0 / 126 Serpentine gamma receptor, Caenorhabditis

species

IPR000344 1 / 97 46 / 81 2 / 71 0 / 126 Nematode chemoreceptor, Sra

IPR002485 7 / 91 36 / 91 0 / 73 0 / 126 Protein of unknown function DUF13

IPR002184 0 / 98 23 / 104 0 / 73 1 / 125 Serpentine beta receptor

IPR013604 0 / 98 4 / 123 0 / 73 61 / 66 7TM chemoreceptor

10 IPR domains found only in M. incognita

IPR008965 19 / 79 0 / 127 0 / 73 0 / 126 Carbohydrate-binding *Cellulose Binding

IPR009009 16 / 82 0 / 127 0 / 73 0 / 126 Barwin-related endoglucanase

IPR002701 5 / 93 0 / 127 0 / 73 0 / 126 Chorismate mutase

IPR000743 4 / 94 0 / 127 0 / 73 0 / 126 Glycoside hydrolase, family 28

IPR001809 4 / 94 0 / 127 0 / 73 0 / 126 Outer surface lipoprotein, Borrelia

IPR000685 3 / 95 0 / 127 0 / 73 0 / 126 Ribulose bisphosphate carboxylase, large chain

IPR006038 3 / 95 0 / 127 0 / 73 0 / 126 Uteroglobin superfamily

IPR005503 3 / 95 0 / 127 0 / 73 0 / 126 Flagellar basal body-associated protein FliL

IPR009101 3 / 95 0 / 127 0 / 73 0 / 126 Gurmarin-like inhibitors/Antifungal toxin

IPR000777 1 / 97 0 / 127 0 / 73 0 / 126 Envelope glycoprotein GP120

The number of proteins with the corresponding root IPR domain (prot nb) and the rank order (rk) are indicated for each genome.
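The rank values (rk) listed above appear to follow a dense ranking, in which domains with the same protein count share a rank (three domains with 181 proteins all carry rank 7 in the first column). The following minimal Python sketch illustrates such a ranking. The tie rule is an assumption inferred from the table, and the domain names and counts are hypothetical toy values.

```python
def dense_rank_desc(counts):
    """Rank domains by decreasing protein count; equal counts share a rank."""
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of = {value: position + 1 for position, value in enumerate(distinct)}
    return {name: rank_of[count] for name, count in counts.items()}


# Hypothetical toy counts (not taken from the table).
toy_counts = {"IPR_A": 181, "IPR_B": 181, "IPR_C": 169, "IPR_D": 0}
ranks = dense_rank_desc(toy_counts)
```

Under this rule, the two toy domains with 181 proteins share the first rank and the next distinct count receives the second rank.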


Figure S12 | Domain sharing between sequenced genomes.

a, Venn diagram illustrating the distribution of identified domains (using InterPro domains) in the M. incognita (Mi) proteome compared to the proteomes of C. elegans, B. malayi and D. melanogaster. Only ~2% of previously recognized domains are present uniquely in M. incognita. 27 of the 52 M. incognita-specific InterPro domains are supported by EST data. b, Venn diagram illustrating the distribution of orthologous proteins (using OrthoMCL) between M. incognita, C. elegans, B. malayi and D. melanogaster. Approximately 25% of the M. incognita clusters are not shared with C. elegans, B. malayi or D. melanogaster.

Table S10 | Overview of general information about M. incognita proteins.

19,212 M. incognita proteins: 10,418 with IPR (54.2%), 4,250 with SP (22.1%), 6,858 with ESTs (35.7%).

Set            IPR status                    SignalP    Proteins    With ESTs
Mi specific    with IPR: 3,128 (31.4%)       SP              759          202
                                             no SP         2,369          684
               without IPR: 6,832 (68.6%)    SP            1,819          338
                                             no SP         5,013          930
Mi shared      with IPR: 7,290 (78.8%)       SP            1,258          661
                                             no SP         6,032        3,330
               without IPR: 1,962 (21.2%)    SP              414          137
                                             no SP         1,548          576

Mi specific: 9,960 proteins (51.8%), comprising 6,522 in Mi-restricted groups and 3,438 not in groups; 2,578 with SP (25.9%); 2,154 with ESTs (21.6%).

Mi shared: 9,252 proteins (48.2%); 1,672 with SP (18.1%); 4,704 with ESTs (50.8%).

Features of M. incognita proteins within and excluded from OrthoMCL groups (IPR: InterPro domain, SP: SignalP). This table does not take into account alternative genes.
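The percentages reported in Table S10 can be cross-checked from its subtotals. The short Python sketch below is only an illustrative consistency check: the variable names are ours, and the counts are transcribed from the table.

```python
TOTAL = 19212  # M. incognita proteins considered in Table S10

# Subtotals transcribed from Table S10.
subsets = {
    "Mi specific": {"with_ipr": 3128, "without_ipr": 6832, "sp": 2578, "ests": 2154},
    "Mi shared": {"with_ipr": 7290, "without_ipr": 1962, "sp": 1672, "ests": 4704},
}


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Size of each subset is the sum of its IPR and non-IPR branches.
sizes = {name: s["with_ipr"] + s["without_ipr"] for name, s in subsets.items()}

# Proteome-wide totals and the corresponding percentages.
overall = {key: sum(s[key] for s in subsets.values()) for key in ("with_ipr", "sp", "ests")}
summary = {key: percent(value, TOTAL) for key, value in overall.items()}
```

The two subsets sum to the full proteome, and the proteome-wide percentages follow from adding the Mi-specific and Mi-shared branches.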


7 Expert functional annotation

Results of automatic annotation and comparison were made available to the consortium of expert annotators through a web site maintained at http://meloidogyne.toulouse.inra.fr/ and linked to a number of analytical tools. The gene models (EuGene predictions) and the results of similarity searches were loaded into the GMOD schema (also known as the Chado database, see http://www.gmod.org/), and these features can be visualized with two genome browsers, GBrowse and Apollo35. Results of the automated functional annotations were synthesized in a report for each protein. A BLAST server was also made available for searches against the M. incognita predicted proteins, assembly, unplaced reads, ESTs and sequences from other nematodes. The results of OrthoMCL clustering, InterProScan and additional tools (similarity-based families, KEGG pathways, etc.) were also provided on the consortium web site.

Each laboratory of the consortium manually annotated gene families using the tools made available, together with its own expertise and routines. Patterns of presence / absence and expansion / reduction in comparison to C. elegans, other nematodes (and other species when appropriate) were examined. The quality of the corresponding gene models was manually checked and potential errors were reported. A functional annotation was proposed according to similarity with characterized proteins. Methods used for each process / family are detailed in Supplementary Methods, sections 8.10 – 8.20.


7.1 Carbohydrate Active enZymes (CAZymes)

We detected and annotated the repertoire of candidate CAZymes encoded by the M. incognita

genome, which we compared to those of two other metazoan species, D. melanogaster and C.

elegans (Supplementary Methods, section 8.10). CAZymes are organized in different

classes (GH, GT, CBM, PL, CE). These classes are further divided into families. CAZymes

are involved in the biosynthesis, modification and degradation of glycoconjugates, oligo- and

polysaccharides. Depending on the types of carbohydrates involved in the metabolism of each

organism, their lifestyle and environment, different patterns of presence / absence, reduction /

expansion of CAZyme families can be expected.

M. incognita features a set of CAZymes similar in overall number to those of the other

metazoans we analyzed (Table S11) but different in composition and distribution.

Table S11 | Comparison of the CAZyme repertoire in M. incognita, C. elegans and D. melanogaster.

Species            GH    GH fam    GT     GT fam    CBM    CBM fam    CE    CE fam    PL    PL fam    EXPN
M. incognita       86    22        171    25        32     6          2     1         31    1         20
C. elegans         95    22        241    36        59     5          2     2         0     0         0
D. melanogaster    97    22        144    39        245    7          3     3         0     0         0

Abundance and number of families (fam) found in each CAZyme class for each organism. GH stands for Glycoside Hydrolases, GT for GlycosylTransferases, CBM for Carbohydrate Binding Modules, CE for Carbohydrate Esterases, PL for Polysaccharide Lyases and EXPN for expansin-like proteins, following the CAZy nomenclature (http://www.cazy.org).

The most important difference in M. incognita was the presence of a unique set of Plant Cell

Wall-degrading CAZymes accompanied by expansins and invertases. When we searched for

further conservation of these enzymes across the domains of life, we uncovered patchy

patterns of presence. Indeed, except for other plant parasitic nematodes (PPNs), the most

similar BLAST hits were always of bacterial or fungal origin (Table S12, Supplementary

Methods, section 8.10). Thus it appears that these M. incognita proteins used in plant

parasitism may have been acquired via Horizontal Gene Transfer (HGT). An alternative

hypothesis is that these enzymes were vertically acquired from a common ancestor of eukaryotes and prokaryotes but retained only in a few eukaryotic lineages (including plant-parasitic nematodes). Under this hypothesis, the patchy presence / absence pattern observed in eukaryotes would be explained by the incomplete sampling of eukaryotic biodiversity.

Availability of additional genomes of plant-parasites in the future will allow discriminating

between these two hypotheses. We also noted the presence of 12 cellulose-binding modules of

family CBM2 (usually found in bacteria) and one of family CBM1 (usually found in fungi). A

total of nine CBM2 modules are appended to candidate cellulases of family GH5, while two

are appended to candidate expansins. The only CBM1 module found was not appended to any

known CAZy module.

Table S12 | Best BLAST hits to M. incognita plant cell wall-degrading and modifying enzymes.

Predicted activity    CAZy family    Abundance in M. incognita    Best hits in CAZy
Cellulase             GH5            21                           Tylenchina, Proteobacteria, Cytophaga, Firmicutes, Coleoptera Insecta
Xylanase              GH5            6                            M. incognita, Firmicutes and Gamma Proteobacteria
Arabinanase           GH43           2                            Actinomycetales, Fungi, Gamma Proteobacteria
Polygalacturonase     GH28           2                            M. incognita, Gamma and Beta Proteobacteria
Pectate lyase         PL3            30                           Tylenchida, Actinomycetales, Fungi
Expansin              EXPN           20                           Tylenchida, Actinomycetales, Fungi, Delta Proteobacteria
Invertase             GH32           2                            Rhizobium Proteobacteria

In red: Metazoans, in blue: Bacteria, in green: Fungi. In case hits were found only in M. incognita and no other Tylenchina, M. incognita is indicated in red.

Apart from the remarkable abundance of plant cell wall-degrading CAZymes detected in M.

incognita and discussed in the main manuscript, we also noticed an additional set of

differences in M. incognita compared to D. melanogaster and C. elegans. The other major

idiosyncrasies in M. incognita were a particular paucity of chitin-related CAZymes, an

expansion of candidate fucosyltransferases and an expansion of candidate trehalose synthases.

Concerning chitin-related CAZymes, we noted a marked reduction in M. incognita of CAZymes related to chitin binding and degradation (Table S13). M. incognita possesses a total of only 15 CAZymes potentially involved in chitin degradation and binding, while C. elegans has 96 such enzymes and D. melanogaster 230. In contrast, these three species all have exactly the same number of CAZymes putatively involved in chitin synthesis (3).

Chitinases in nematodes may serve as antifungal defense for free-living species or as effectors

for fungivorous nematodes, as well as being involved in eggshell degradation. As M.

incognita spends most of its life cycle within the host plant, it may benefit from the plant’s

barriers and protections against fungi. The overall abundance in D. melanogaster, which is

essentially due to a massive expansion of the CBM14 family (in the binding category), can be

related to the need in insects to perform several molting cycles that include degradation of

their own exoskeleton. Nematodes also molt, although their cuticle is unlikely to be as extensive in relative mass as the insect exoskeleton.

Table S13 | Chitin-related CAZymes

Degradation Binding Synthesis

Family / Species GH18 GH19 GH20 Total CBM14 CBM18 Total GT2a

M. incognita 2 2 4 8 7 0 7 3

C. elegans 38 5 5 48 41 7 48 3

D. melanogaster 12 0 4 16 214 0 214 3

Comparison of the abundance of CAZyme families known to be related to chitin metabolism in M. incognita, C.

elegans and D. melanogaster. a) Processive GT2s.
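The totals quoted in the text (15, 96 and 230 CAZymes for chitin degradation and binding) follow directly from the family counts in Table S13. As a hedged illustration, the Python sketch below recomputes them. The data structure and function name are ours, and the counts are transcribed from the table.

```python
# Counts transcribed from Table S13 (chitin-related CAZyme families).
chitin = {
    "M. incognita": {
        "degradation": {"GH18": 2, "GH19": 2, "GH20": 4},
        "binding": {"CBM14": 7, "CBM18": 0},
        "synthesis": {"GT2": 3},
    },
    "C. elegans": {
        "degradation": {"GH18": 38, "GH19": 5, "GH20": 5},
        "binding": {"CBM14": 41, "CBM18": 7},
        "synthesis": {"GT2": 3},
    },
    "D. melanogaster": {
        "degradation": {"GH18": 12, "GH19": 0, "GH20": 4},
        "binding": {"CBM14": 214, "CBM18": 0},
        "synthesis": {"GT2": 3},
    },
}


def category_total(species, category):
    """Sum the family counts of one category for one species."""
    return sum(chitin[species][category].values())


degradation_and_binding = {
    species: category_total(species, "degradation") + category_total(species, "binding")
    for species in chitin
}
synthesis = {species: category_total(species, "synthesis") for species in chitin}
```

The same structure makes it apparent that the D. melanogaster total is dominated by a single family, CBM14.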

The third differentiating feature that emerged from the comparison of M. incognita, C.

elegans and D. melanogaster CAZyme sets concerns candidate fucosyltransferases. C.

elegans has been shown to present a broad range of unusual fucosylated structures among its panel of N-glycosylation modifications compared to other metazoans36, 37. This is supported by a more abundant set of candidate fucosyltransferases in C. elegans compared to D. melanogaster (Table S14). This tendency for a complex and rich fucosylation pattern in nematodes is further developed in M. incognita, which presents almost twice as many candidate fucosyltransferases as C. elegans. It has recently been suggested that multi-fucosylated structures in nematodes could support the parasitic lifestyle37.


Finally, we also noted a substantial expansion in M. incognita of candidate trehalose

synthases. This may reflect a particular importance of trehalose metabolism in this nematode

that should be investigated.

Table S14 | Other CAZyme families with substantial expansions in M. incognita.

Candidate Fucosyltransferases Candidate trehalose synthases

Family / Species GT10 GT11 GT23 Total GT20

M. incognita 30 13 18 61 8

C. elegans 5 26 1 32 2

D. melanogaster 5 0 1 6 1


7.2 Proteases

Proteases are proteolytic enzymes that hydrolyze peptide bonds and perform crucial functions in all living organisms. The completion of large-scale genome sequencing programs has made possible the global characterization of degradomes (i.e., the complete set of proteases that are expressed at a specific moment or circumstance by a cell, tissue or organism38) in various organisms39, 40. In nematodes, proteases play essential roles in a broad range of biological processes, as diverse as molting in the free-living species C. elegans41 and digestive specificity in blood-feeding parasites42.

We identified 339 M. incognita genes predicted to encode proteases (i.e., ~1.7% of M. incognita genes encode peptidases; Supplementary Methods, section 8.11). This is comparable to the abundance in Caenorhabditis spp. but more than three times the corresponding number in B. malayi (Table S15). An additional subset of nine M. incognita sequences related to aspartic proteases that are embedded in endogenous retroviral elements was removed from the global analysis. Overall, no substantial variation in the number of proteins was noticed for any of the five protease classes, compared to C. elegans. However, a more detailed analysis revealed expansions of some protease subfamilies in M. incognita. The two most important of these expansions concern the C48 and S16 subfamilies, which are discussed in the main paper. Since parasitism genes in root-knot nematodes encode secreted proteins43, the observation that 92 proteases in M. incognita are predicted to be secreted (~27.1% of the nematode protease set; Table S15) reinforces the hypothesis that members of the nematode degradome play a direct role in parasitism.


Table S15 | The degradomes of the nematodes C. elegans, C. briggsae, B. malayi and M. incognita

Family C. elegans C. briggsae B. malayi M. incognita

Asp 27 (46.3) 26 6 20 (50.0)

Cys 118 (21.6) 100 18 114 (21.1)

Metallo 207 (37.2) 158 51 138 (34.1)

Ser 72 (45.8) 60 20 53 (20.7)

Thr 15 (21.4) 15 2 14 (0)

Total 439 (33.9) 359 97 339 (27.1)

Based on the specificity of their catalytic activity, proteases are classified into five distinct classes: aspartic (Asp), cysteine (Cys), metallo (Metallo), serine (Ser) and threonine (Thr) proteases. For C. elegans and M. incognita, numbers in brackets give the percentage of putatively secreted proteases, based on SignalP analysis. Asp proteases embedded in endogenous retroviral elements were not taken into account in the analysis.
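The figures quoted for the M. incognita degradome (339 proteases, 92 of them predicted to be secreted, more than three times the B. malayi set) can be recomputed from Table S15. The Python sketch below is an illustrative check only. The variable names are ours, and the counts and the 27.1% secreted fraction are transcribed from the table.

```python
# Protease counts per catalytic class, transcribed from Table S15.
degradome = {
    "C. elegans": {"Asp": 27, "Cys": 118, "Metallo": 207, "Ser": 72, "Thr": 15},
    "C. briggsae": {"Asp": 26, "Cys": 100, "Metallo": 158, "Ser": 60, "Thr": 15},
    "B. malayi": {"Asp": 6, "Cys": 18, "Metallo": 51, "Ser": 20, "Thr": 2},
    "M. incognita": {"Asp": 20, "Cys": 114, "Metallo": 138, "Ser": 53, "Thr": 14},
}

totals = {species: sum(classes.values()) for species, classes in degradome.items()}

# Overall percentage of putatively secreted proteases in M. incognita (Table S15).
secreted_percent_mi = 27.1
secreted_mi = round(totals["M. incognita"] * secreted_percent_mi / 100)

# Ratio of the M. incognita degradome to the B. malayi degradome.
fold_vs_b_malayi = totals["M. incognita"] / totals["B. malayi"]
```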


7.3 Orthologs to published nematode plant-parasitism genes

Several genes that are suspected to be involved in plant parasitism in nematodes have already

been published in the literature. We searched for homologs of these genes in the genome of

M. incognita via BLAST searches against proteins, assembly and unplaced reads. We also

searched for these genes in the genome of C. elegans and in animal-parasitic nematodes (APN). Protein accession numbers and bibliographic references are given for each query used (Table S16).
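A presence / absence summary of the kind shown in Table S16 can be derived from tabular BLAST output. The Python sketch below shows one possible way to do this and is not the procedure used by the annotators. It assumes the standard 12-column tabular format (E-value in the 11th column), an arbitrary E-value cutoff of 1e-5, and a hypothetical "database|identifier" naming scheme for subject sequences. The example hits are invented for illustration.

```python
def presence_matrix(blast_lines, queries, databases, max_evalue=1e-5):
    """Return {query: {database: '+' or '0'}} from tabular BLAST hits.

    Assumptions (not from the source): 12-column tabular format,
    subject identifiers prefixed with 'database|', E-value cutoff max_evalue.
    """
    matrix = {query: {db: "0" for db in databases} for query in queries}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        database = subject.split("|", 1)[0]
        if query in matrix and database in matrix[query] and evalue <= max_evalue:
            matrix[query][database] = "+"
    return matrix


# Invented example hits for one query accession (hypothetical subject names).
example_hits = [
    "CAC83611\tMinc|hypothetical_1\t45.0\t200\t100\t2\t1\t200\t5\t204\t1e-50\t180",
    "CAC83611\tCele|hypothetical_2\t25.0\t80\t55\t3\t1\t80\t10\t89\t2.5\t30",
]
table = presence_matrix(example_hits, ["CAC83611"], ["Minc", "APN", "Cele"])
```

In practice, candidate hits near the cutoff would still need manual inspection, as was done for the gene models in this study.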


Table S16 | Conservation of PPN parasitism genes within nematodes

PPN parasitism gene             Accession    Ref.      Public name    Species             M.inc    APN    C.ele
SXP/RAL                         CAB75701     44, 45    Gr-sxp-1       G. rostochiensis    +        +      +
SXP/RAL                         CAB66341     44, 45    Gr-ams-1       G. rostochiensis    +        +      +
Expansin                        CAC83611     46        Gr-expb-1      G. rostochiensis    +        0      0
VAP*                            AAK60209     47        Hg-vap1        H. glycines         +        +      +
VAP*                            AAK55116     47        Hg-vap2        H. glycines         +        +      +
VAP*                            AAD01511     48        Mi-msp-1       M. incognita        +        +      +
Fatty Acid Retinoid Binding     CAA70477     49, 50    Gp-far-1       G. pallida          +        +      +
Glutathione Peroxidase          CAD38523     51        Gr-gpx-1       G. rostochiensis    +        +      +
14-3-3                          AAL40719     52        Mi-14-3-3a     M. incognita        +        +      +
14-3-3b                         AAR85527     52        Mi-14-3-3b     M. incognita        +        +      +
Calreticulin                    AAL40720     53        Mi-crt-1       M. incognita        +        +      +
Endochitinase                   AAN14978     54        Hg-chi-1       H. glycines         +        +      +
Cysteine Proteases              CAD89795     55        Mi-cpl-1       M. incognita        +        +      +
RAN-BP-like                     CAC21849     56        Gr-A41         G. rostochiensis    0        0      0
RAN-BP-like                     CAC21848     56        Gr-A18         G. rostochiensis    0        0      0
RAN-BP-like                     AAV34698     57        Gp-rbp-1       G. pallida          0        0      0
Ubiquitin Extension Proteins    AAP30081     58        Hs-ubi-1       H. schachtii        0        0      0
Chorismate mutase               AAO19577     59        Hg-cm-1        H. glycines         +        0      0
Chorismate mutase               AAD42163     60        Mj-cm-1        M. javanica         +        0      0
Chorismate mutase               CAD29887     61        Gp-cm-1        G. pallida          +        0      0
CLE                             AAG21331     62        Hg-syv-46      H. glycines         0        0      0
NodL-like                       AW570655a    63        Mi-NodL        M. incognita        +        0      0

*VAP: Venom Allergen-like Protein. Grey boxes and + indicate presence of homologs of the gene. 0 indicates that no trace of the gene has been found. a EST accession number.


7.4 Pioneer genes

We searched the genome of M. incognita for genes that were originally identified as “pioneer” genes by Huang et al.64 in an analysis of ESTs from a gland cell-specific cDNA library made from M. incognita. Twenty-seven such pioneers were originally defined, and we searched for their presence in the assembly, unplaced reads, ESTs and predicted proteins. All were present in the assembly and unplaced reads, and our screening of the M. incognita genome revealed 11 additional paralogs of these genes. We further extensively searched for these genes in other nematodes, in other metazoans and in the other domains of life. All originally defined pioneers and their additional paralogs remained specific to Meloidogyne spp. (Table S17). Several of the genes were present in clusters of very similar genes, supporting previous observations that parasitism genes may be present in clusters and may have arisen as a result of gene duplication65. The most notable example was one parasitism gene family with 13 members present mainly as three clusters in the M. incognita genome. One of the pioneer genes (AY134435, 16D10) has a motif with similarity to CLAVATA3/ESR-like (CLE) plant signalling peptides66, but 16D10 has not been demonstrated to function as a CLE peptide similar to the CLE peptides found in another PPN62. The 16D10 gene has been used as a target for a plant host-derived RNAi-based control strategy against root-knot nematodes in A. thaliana67. No other genes with significant similarity to this gene were identified in the M. incognita genome. Little or no functional redundancy may therefore be present for this parasitism gene product, perhaps explaining why it is such an efficacious target for RNAi-based control.


Table S17 | Presence of pioneer genes in M. incognita genome and in other species.

Protein accession number                 M. incognita genome (+ new copies)    Other PPN                   C. elegans    Other species    Pfam domain    InterPro domain
AF531161                                 + (+1)                                M. javanica, M. arenaria    0             0                PF01549        IPR003582
AF531164                                 + (+1)                                0                           0             0                0              0
AF531165, AY134432, AY134442,            + (+2)*                               0                           0             0                0              0
  AY134437, AF531160, AY134433,
  AY134431, AF531166, AY134434,
  AY134438, AY142119*
AF531167                                 +                                     0                           0             0                0              0
AF531168                                 +                                     0                           0             0                0              0
AF531169                                 +                                     0                           0             0                0              0
AY134435 (16D10)                         +                                     0                           0             0                0              0
AY134436                                 + (+1)                                0                           0             0                0              0
AY134439                                 +                                     0                           0             0                0              0
AY134441                                 + (+2)                                M. javanica, M. arenaria    0             0                0              0
AY134443                                 +                                     0                           0             0                0              0
AY134444                                 + (+1)                                M. arenaria                 0             0                0              0
AY135363                                 + (+1)                                0                           0             0                0              0
AY142118                                 +                                     0                           0             0                0              0
AY142120                                 +                                     0                           0             0                0              0
AY142121 & AY142116                      + (+2)                                0                           0             0                0              0

The accession number of each original pioneer gene is given, as well as the presence of this gene in the genome of M. incognita, in other plant-parasitic nematodes (PPNs), in C. elegans and in other species. Grey boxes and + indicate presence of homologs of the gene. 0 indicates that no trace of the gene has been found. Where new paralogs of a given pioneer gene were found in the genome, the number of additional copies is indicated in parentheses. * These eleven pioneer genes are classified as a single one-species OrthoMCL cluster, and they are thus considered as paralogs. Two additional paralogs were found in the genome.


7.5 Gene families involved in protection against environmental stresses

Antioxidant enzymes protect aerobic organisms from endogenous and exogenous cytotoxic

oxygen radicals. There were three superoxide dismutase (SOD) genes in M. incognita, two

copies (% ID = 97%) of the gene encoding the copper-zinc enzyme and one copy encoding

the iron-manganese enzyme (Supplementary Methods, section 8.12). For comparison, an

expanded SOD family can be found in C. elegans with three genes encoding the Cu-Zn

enzyme (sod-1, sod-4 and sod-5) and two genes encoding the Mn-Fe enzyme (sod-2 and sod-

3). Like C. elegans, M. incognita had three catalase genes. However, two of these genes

encoded very similar products (%ID >95%) and could be derived from two copies of the same

gene. In contrast to C. elegans where at least six different GPX genes have been annotated,

there were two copies of a single gene type in M. incognita (%ID = 99%). The glutathione synthetase family included two copies of each of two genes in M. incognita, whereas only a single-copy gene is found in C. elegans. Two copies of a single copper chaperonin gene were found in M. incognita, compared with a single gene in C. elegans (Table S18).

Glutathione S-transferases (GSTs) are multifunctional proteins essential for xenobiotic metabolism and protection against peroxidative damage68. The GST superfamily can be divided into several structurally and functionally distinct classes that show unique variations among different phylogenetic groups. In C. elegans, 44 members have been identified, which belong to the Omega, Sigma and Zeta classes69. In M. incognita, we identified five members (Table S18 and Supplementary Methods, section 8.13) that belong to the Sigma class (and one GST-like similar to one GST from B. malayi). This strong variation in relative abundance between C. elegans and M. incognita has been similarly observed with B. malayi. In addition, the only conserved class found in M. incognita, the Sigma class, seems to be involved in protection against oxidants rather than xenobiotics. This suggests that the life-style of the nematodes may have exerted selective pressure on these detoxification genes. The parasitic nature of M. incognita may be in part responsible for the extremely low number and the specialization of GST genes, since these nematodes are in close relation with their host and are less exposed than free-living nematodes to the variety of biotic and abiotic environmental stresses.


Table S18 | Putative detoxification genes in M. incognita.

Function                     Mode of action / Family name      Number of genes in M. incognita    Number of genes in C. elegans as referred to public databases
Antioxidant                  Catalase                          3                                  3
                             Peroxiredoxin                     7                                  3
                             Superoxide dismutase              3                                  5
                             Copper chaperonin                 2                                  1
                             Glutathione peroxidase            2                                  6
                             Glutathione synthetase            4                                  1
Glutathione-S-transferase    GST class sigma and sigma-like    5                                  9
                             GST class omega                   0                                  4
                             GST class zeta                    0                                  2
                             GST other classes                 0                                  29
Cytochrome P450              CYP13                             6                                  14
                             CYP23                             1                                  1
                             CYP25-like                        1                                  6
                             CYP31                             2                                  4
                             CYP32                             3                                  1
                             CYP33-like                        11                                 17
                             CYP42                             2                                  1
                             partial CYP                       1                                  NA

The cytochromes P450 (CYPs) constitute a large and ancient family of heme-thiolate monooxygenase proteins that catalyze the oxidative metabolism of a vast array of compounds, including both endogenous substrates such as steroids, bile acids, fatty acids and prostaglandins, and exogenous substrates including many pollutants. There were at least 27 full or partial CYP genes in the M. incognita genome, divided among at least eight families (Supplementary Methods, section 8.14). This is a substantial reduction relative to C. elegans, which has 80 different cyp genes divided among 16 families70. The parasitic nature of M. incognita may be in part responsible for the lower number of cyp genes, particularly those involved in the xenobiotic response. The CYPs in C. elegans have evolved a broad variety of functions in order to cope with the large variety of environmental conditions faced by this free-living nematode. In particular, C. elegans CYP35 genes are responsive to a variety of xenobiotic stressors70-72, and a number of other CYPs have been shown via microarray to be induced by PCB52 (ref. 73), including members of families CYP13, CYP14, CYP25, CYP29, CYP33, CYP34 and CYP37. M. incognita possesses six CYP13 genes and 11 CYP33-like genes, but does not have any genes homologous to CYP35, implying a reduced number of xenobiotic-metabolizing P450s relative to C. elegans and C. briggsae.

Although information on the immune defences of plant-parasitic nematodes is scarce74, 75, innate immune effectors and signalling pathways are better characterised in C. elegans76, 77. We therefore searched the genome of M. incognita for genes that could be involved in innate immunity (Supplementary Methods, section 8.15). The insulin, TGF-beta, ERK, p38 MAPK and Toll signalling pathways were conserved enough to allow us to identify orthologs in M. incognita (Table S19 - separate file). One exception was trf-1, the C. elegans homolog of TNF receptor-associated factor 3, a gene potentially involved in the Toll signalling pathway78. In contrast to the conservation of the genes encoding components of these signalling pathways, entire classes of innate immune effectors were absent from the genome of M. incognita. These included antibacterial genes such as abf and spp79 (6 and 23 genes in C. elegans, respectively). In the case of lysozymes, which are induced by bacterial infection80, only three genes were identified, compared to 10 in C. elegans. Similarly, C-type lectin genes appear to be much less abundant in M. incognita than in C. elegans (56 genes and 277 genes, respectively). Additionally, no homologs were identified for known or putative antifungal genes of several classes (nlp, cnc, fip, fipr)81, 82. This striking difference between the conservation of signalling molecules and the lack of defence genes has at least two possible explanations. Firstly, M. incognita may have lost many genes as it lives much of its life in a privileged environment, protected by the host plant’s defences. As the known signalling pathways involved in C. elegans innate immunity all have important developmental roles77, their presence in M. incognita may not reflect a conserved function in defence. Alternatively, the pathways could still be important for host immunity, but control effectors that are not conserved between the two nematode species.


7.6 Nuclear Receptors

Our analysis indicated that at least 14 supplementary NRs were one-to-one orthologs between

B. malayi, M. incognita and C. elegans or only between M. incognita and C. elegans, with

secondary losses in B. malayi, or between C. elegans and B. malayi with losses in M.

incognita (Figure S13, Table S20 - separate file and Supplementary Methods, section 8.16).

Figure S13 | Nematode nuclear receptors. Both the usual Caenorhabditis name and the official

nomenclature name of the proteins are given. For each nuclear receptor, a coloured box indicates its

presence/absence in the genome for each of the three species. For HNF4, details of the relationships between

them are given in the last two rows.


7.7 Kinome

Kinases are key regulatory enzymes that control many aspects of physiology and biochemistry. M. incognita had 499 predicted kinases, which were grouped into 232 OrthoMCL groups based on sequence similarity (Supplementary Methods, section 8.17). C. elegans has 411 predicted kinases83, B. malayi 215 (ref. 28) and D. melanogaster 239 (ref. 84). Of the 232 predicted M. incognita kinase OrthoMCL groups, 158 were predicted to be orthologous to kinases from C. elegans and 152 had orthologs in both C. briggsae and B. malayi. Interestingly, 24 M. incognita kinase OrthoMCL groups contained only orthologs from these three species, suggesting that they have nematode-specific functions. A total of four M. incognita kinase OrthoMCL groups contained only orthologs from B. malayi, suggesting a potential role for these genes in parasitism. Finally, 66 kinase OrthoMCL groups containing 122 genes appear to be M. incognita-specific. Detailed results of our analysis of the M. incognita kinome are available in Table S21 (separate file).
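The classification of kinase groups by species composition described above can be expressed with simple set operations. The following Python sketch is illustrative only: the category labels are a simplification of the classes discussed in the text, and the group names and memberships are hypothetical toy data, not OrthoMCL output from this study.

```python
from collections import Counter

NEMATODES = {"M. incognita", "C. elegans", "C. briggsae", "B. malayi"}


def classify(species_in_group):
    """Assign a group to a category based on the set of species it contains."""
    species = set(species_in_group)
    if species == {"M. incognita"}:
        return "M. incognita-specific"
    if species == {"M. incognita", "B. malayi"}:
        return "shared only with B. malayi"
    if species <= NEMATODES:
        return "nematode-restricted"
    return "shared beyond nematodes"


# Hypothetical toy groups: one species label per member gene.
toy_groups = {
    "group_1": ["M. incognita", "M. incognita"],
    "group_2": ["M. incognita", "B. malayi"],
    "group_3": ["M. incognita", "C. elegans", "C. briggsae", "B. malayi"],
    "group_4": ["M. incognita", "C. elegans", "D. melanogaster"],
}
category_counts = Counter(classify(members) for members in toy_groups.values())
```

Applied to real group memberships, such a tally would give the number of groups in each category, together with the number of genes they contain if member counts are kept.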


Supplementary Table S19 | Comparison of immune response genes in C. elegans and M. incognita

C. elegans    M. incognita    C. briggsae    B. malayi    D. melanogaster

Insulin (DAF-2/DAF-16) signaling pathway
age-1         +    +    +    +
pdk-1         +    +    +    +
akt-1         +    +    +    +
akt-2         +    +    +    +
daf-16        +    +    +    +
daf-2         +    +    +    +

TGF-beta (DBL-1) signaling pathway
dbl-1         +    +    +    +
sma-2         +    +    +    +
sma-3         +    +    +    +
sma-4         +    +    +    +

ERK MAPK signaling pathway
lin-45        +    +    +    +
mek-2         +    +    +    +
mpk-1         +    +    +    +

p38 MAPK signaling pathway
nsy-1         +    +    +    +
pmk-1         +    +    +    +
sek-1         +    +    +    +
tir-1         +    +    +    +

Toll signaling pathway
tol-1         +    +    0    +
trf-1         0    +    0    +
ikb-1         +    +    +    0
pik-1         +    +    +    +

Antibacterial
abf-1         0    0    0    0
abf-2         0    +    0    0
abf-3         0    0    0    0
abf-4         0    0    0    0
abf-5         0    +    0    0
abf-6         0    +    0    0
lys-1         0    +    0    0
lys-2         0    +    0    0
lys-3         0    +    0    0
lys-4         0    +    0    0
lys-5         +    +    0    0
lys-6         0    +    0    0
lys-7         +    0    0    0
lys-8         +    +    +    0
lys-9         0    0    0    0
lys-10        0    0    0    0
spp-1         0    +    0    0
spp-2         0    +    0    0
spp-3         0    +    0    0
spp-4         0    +    0    0
spp-5         0    +    0    0
spp-6         0    0    0    0
spp-7         0    +    0    0
spp-8         0    +    0    0
spp-9         0    +    0    0
spp-10        +    +    +    0
spp-11        0    +    0    0
spp-12        0    +    0    0
spp-13        0    +    0    0
spp-14        0    0    0    0
spp-15        0    +    0    0
spp-16        0    +    0    0
spp-17        0    +    0    0
spp-18        0    +    0    0
spp-19        0    +    0    0
spp-20        0    0    0    0
spp-21        0    +    0    0
spp-22        0    0    0    0
spp-23        0    +    0    0

Antifungal
nlp-27        0    0    0    0
nlp-28        0    0    0    0
nlp-29        0    0    0    0
nlp-30        0    0    0    0
nlp-31        0    0    0    0
nlp-34        0    0    0    0
cnc-1         0    0    0    0
cnc-2         0    0    0    0
cnc-3         0    0    0    0
cnc-4         0    0    0    0
cnc-5         0    0    0    0
cnc-6         0    0    0    0
cnc-7         0    0    0    0
cnc-8         0    0    0    0
cnc-9         0    0    0    0
cnc-10        0    0    0    0
cnc-11        0    0    0    0
grsp-1        0    0    0    0
grsp-2        0    0    0    0
grsp-3        0    0    0    0
grsp-4        0    0    0    0
fip-1         0    0    0    0
fip-2         0    +    0    0
fip-3         0    0    0    0
fip-4         0    0    0    0
fip-5         0    0    0    0
fip-6         0    0    0    0
fip-7         0    0    0    0
fipr-1        0    +    0    0
fipr-2        0    0    0    0
fipr-3        0    0    0    0
fipr-4        0    0    0    0
fipr-5        0    0    0    0
fipr-6        0    0    0    0
fipr-7        0    0    0    0
fipr-8        0    0    0    0
fipr-9        0    0    0    0
fipr-10       0    0    0    0
fipr-11       0    0    0    0
fipr-12       0    0    0    0
fipr-13       0    0    0    0
fipr-14       0    0    0    0
fipr-15       0    0    0    0
fipr-16       0    0    0    0
fipr-17       0    0    0    0
fipr-18       0    0    0    0
fipr-19       0    0    0    0
fipr-20       0    0    0    0
fipr-21       0    0    0    0
fipr-22       0    0    0    0
fipr-23       0    0    0    0
fipr-24       0    0    0    0
fipr-25       0    0    0    0
fipr-26       0    0    0    0
fipr-27       0    0    0    0
fipr-28       0    0    0    0
fipr-29       0    0    0    0

Supplementary Table S20 | Nuclear receptors

Group   C. elegans    B. malayi  M. incognita          D. melanogaster      H. sapiens
0A      odr-7         BmNHR-B    -                     Knirps, KNRL, Eagle  -
0B      -             -          -                     -                    DAX1, SHP
1A      -             -          -                     -                    TRa, TRb
1B      -             -          -                     -                    RARa, RARb, RARg
1C      -             -          -                     -                    PPARa, PPARb, PPARg
1D      nhr-85        BmNHR11    -                     E75                  Reverba, Reverbb
1E+G    sex-1; CNRD   -          -                     E78                  -
1F      HR3           BmNHR13    Minc10028, Minc03383  DHR3                 RORa, RORb, RORg
1H      -             BmNHR3     -                     EcR                  LXRa, LXRb, FXRa
1I      -             -          -                     -                    VDR, PXR, CAR
1J+K    DAF-12        BmNHR17    Minc18589             DHR96                -
        NHR-8         BmNHR31    Minc13296             -                    -
        NHR-48        -          -                     -                    -
2A      supnrs        supnrs     supnrs                HNF4                 HNF4a, HNF4g
2B      -             BmNHR4     -                     USP                  RXRa, RXRb, RXRg
2C      -             -          -                     -                    TR2, TR4
2D      NHR-41        BmNHR5     -                     DHR78                -
2E      NHR-67        BmNHR15    Minc12751             TLL                  TLL
        FAX-1         BmNHR16    Minc02801             PNR                  PNR
        -             -          -                     DSF                  -
        -             -          -                     FAX-1                -
2F      unc-55        BmNHR25    -                     SVP                  COUP-TFa, COUP-TFb, EAR2
3A      -             -          -                     -                    ERa, ERb
3B      -             -          -                     ERR                  ERRa, ERRb, ERRg
3C      -             -          -                     -                    GR, MR, PR, AR
4A      CNR8          -          -                     DHR38                NGFIB, NURR1, NOR1
5A      FTZ-F1        BmNHR14    -                     FTZ-F1               SF1, LRH1
5B      -             -          -                     DHR39                -
6A      GCNF          BmNHR21    -                     GRF                  GCNF
SupNRs  nhr-5         -          Minc15185             -                    -
SupNRs  nhr-7         -          Minc01725             -                    -
SupNRs  nhr-14        BmNHR22    Minc11307             -                    -
SupNRs  nhr-31        BmNHR10    -                     -                    -
SupNRs  nhr-32        -          Minc11538             -                    -
SupNRs  nhr-35        -          Minc17538             -                    -
SupNRs  nhr-40        BmNHR24    -                     -                    -
SupNRs  nhr-49        BmNHR18    Minc02316             -                    -
SupNRs  nhr-64        -          Minc02318             -                    -
SupNRs  nhr-88        BmNHR19    Minc15420             -                    -
SupNRs  nhr-97        -          Minc11986             -                    -
SupNRs  nhr-105       -          Minc01325             -                    -
SupNRs  nhr-107       -          Minc16419             -                    -
SupNRs  nhr-277       -          Minc15059             -                    -
Total   14 + supnrs   -          92                    21                   48

Data for C. elegans, C. briggsae, D. melanogaster, C. intestinalis and H. sapiens are from Bertrand et al. (2004). Data and nomenclature for B. malayi are from Ghedin et al. (2007). For Meloidogyne, only receptors with clear orthology relationships to other known receptors are indicated in the table.

Supplementary Table S21 | Kinome

Gene Model Accession Number    Assembly Version    Family Name    Activity    Description    EC Number

protMinc12770 MINCV1A1 Kinase Kinase

protMinc01872 MINCV1A1 Kinase Kinase

protMinc04997a MINCV1A1 Kinase Kinase

protMinc04997b MINCV1A1 Kinase Kinase

protMinc04997c MINCV1A1 Kinase Kinase

protMinc08002 MINCV1A1 Kinase Kinase

protMinc18378a MINCV1A1 Kinase Kinase

protMinc18378b MINCV1A1 Kinase Kinase

protMinc16587 MINCV1A1 Kinase Kinase

protMinc03881a MINCV1A1 Kinase Kinase

protMinc08283a MINCV1A1 Kinase Kinase

protMinc13804 MINCV1A1 Kinase Kinase

protMinc16542 MINCV1A1 Kinase Kinase

protMinc07540 MINCV1A1 Kinase Kinase

protMinc15324a MINCV1A1 Kinase Kinase

protMinc15324b MINCV1A1 Kinase Kinase

protMinc15324d MINCV1A1 Kinase Kinase

protMinc17753 MINCV1A1 Kinase Kinase

protMinc15054 MINCV1A1 Kinase Kinase

protMinc03915a MINCV1A1 Kinase Kinase

protMinc03915b MINCV1A1 Kinase Kinase

protMinc03915c MINCV1A1 Kinase Kinase

protMinc15430 MINCV1A1 Kinase Kinase

protMinc15683 MINCV1A1 Kinase Kinase

protMinc03002 MINCV1A1 Kinase Kinase

protMinc03003 MINCV1A1 Kinase Kinase

protMinc06140 MINCV1A1 Kinase Kinase

protMinc12074 MINCV1A1 Kinase Kinase

protMinc16594 MINCV1A1 Kinase Kinase

protMinc00176 MINCV1A1 Kinase Kinase

protMinc03498 MINCV1A1 Kinase Kinase

protMinc12608 MINCV1A1 Kinase Kinase

protMinc14680 MINCV1A1 Kinase Kinase

protMinc15293 MINCV1A1 Kinase Kinase

protMinc15374 MINCV1A1 Kinase Kinase

protMinc16085 MINCV1A1 Kinase Kinase

protMinc17039 MINCV1A1 Kinase Kinase

protMinc17056 MINCV1A1 Kinase Kinase

protMinc17317 MINCV1A1 Kinase Kinase

protMinc17349c MINCV1A1 Kinase Kinase

protMinc18139 MINCV1A1 Kinase Kinase

protMinc01077 MINCV1A1 Kinase Kinase

protMinc03268 MINCV1A1 Kinase Kinase

protMinc08988 MINCV1A1 Kinase Kinase

protMinc08449 MINCV1A1 Kinase Kinase

protMinc09596a MINCV1A1 Kinase Kinase

protMinc09596b MINCV1A1 Kinase Kinase

protMinc12562 MINCV1A1 Kinase Kinase

protMinc15331 MINCV1A1 Kinase Kinase

protMinc15332b MINCV1A1 Kinase Kinase

protMinc15332e MINCV1A1 Kinase Kinase

protMinc15433a MINCV1A1 Kinase Kinase

protMinc15433b MINCV1A1 Kinase Kinase

protMinc03691 MINCV1A1 Kinase Kinase

protMinc07788 MINCV1A1 Kinase Kinase

protMinc08219 MINCV1A1 Kinase Kinase

protMinc12144 MINCV1A1 Kinase Kinase

protMinc12145 MINCV1A1 Kinase Kinase

protMinc12146 MINCV1A1 Kinase Kinase

protMinc12883 MINCV1A1 Kinase Kinase

protMinc14751 MINCV1A1 Kinase Kinase

protMinc17015 MINCV1A1 Kinase Kinase

protMinc18363 MINCV1A1 Kinase Kinase

protMinc18536 MINCV1A1 Kinase Kinase

protMinc12540 MINCV1A1 Kinase Kinase

protMinc17623 MINCV1A1 Kinase Kinase

protMinc03530 MINCV1A1 Kinase Kinase

protMinc05986 MINCV1A1 Kinase Kinase

protMinc08669 MINCV1A1 Kinase Kinase

protMinc08670 MINCV1A1 Kinase Kinase

protMinc09572 MINCV1A1 Kinase Kinase

protMinc11456 MINCV1A1 Kinase Kinase

protMinc19094 MINCV1A1 Kinase Kinase

protMinc06568 MINCV1A1 Kinase Kinase

protMinc06736 MINCV1A1 Kinase Kinase

protMinc12051 MINCV1A1 Kinase Kinase

protMinc10720 MINCV1A1 Kinase Kinase

protMinc11633 MINCV1A1 Kinase Kinase

protMinc06280 MINCV1A1 Kinase Kinase

protMinc12621 MINCV1A1 Kinase Kinase

protMinc00343 MINCV1A1 Kinase Kinase

protMinc04583 MINCV1A1 Kinase Kinase

protMinc06579 MINCV1A1 Kinase Kinase

protMinc06746 MINCV1A1 Kinase Kinase

protMinc02295 MINCV1A1 Kinase Kinase Protein kinase-like

protMinc04016 MINCV1A1 Kinase Kinase

protMinc06187 MINCV1A1 Kinase Kinase

protMinc04931a MINCV1A1 Kinase Kinase

protMinc04931c MINCV1A1 Kinase Kinase

protMinc04931e MINCV1A1 Kinase Kinase

protMinc12703a MINCV1A1 Kinase Kinase

protMinc12703b MINCV1A1 Kinase Kinase

protMinc12703d MINCV1A1 Kinase Kinase

protMinc07273 MINCV1A1 Kinase Kinase

protMinc13672 MINCV1A1 Kinase Kinase

protMinc03376 MINCV1A1 Kinase Kinase

protMinc06801 MINCV1A1 Kinase Kinase

protMinc10020 MINCV1A1 Kinase Kinase

protMinc15314 MINCV1A1 Kinase Kinase

protMinc13344 MINCV1A1 Kinase Kinase

protMinc13475 MINCV1A1 Kinase Kinase

protMinc15886 MINCV1A1 Kinase Kinase

protMinc17922 MINCV1A1 Kinase Kinase

protMinc09153 MINCV1A1 Kinase Kinase

protMinc09154 MINCV1A1 Kinase Kinase

protMinc09156 MINCV1A1 Kinase Kinase

protMinc00750 MINCV1A1 Kinase Kinase

protMinc01474 MINCV1A1 Kinase Kinase

protMinc09328 MINCV1A1 Kinase Kinase

protMinc15737 MINCV1A1 Kinase Kinase

protMinc18479 MINCV1A1 Kinase Kinase

protMinc09327 MINCV1A1 Kinase Kinase

protMinc16984 MINCV1A1 Kinase Kinase

protMinc03854 MINCV1A1 Kinase Kinase

protMinc01198 MINCV1A1 Kinase Kinase

protMinc06514 MINCV1A1 Kinase Kinase

protMinc18717 MINCV1A1 Kinase Kinase

protMinc15744 MINCV1A1 Kinase Kinase

protMinc15461 MINCV1A1 Kinase Kinase

protMinc00611 MINCV1A1 Kinase Kinase

protMinc03777 MINCV1A1 Kinase Kinase

protMinc00610 MINCV1A1 Kinase Kinase

protMinc03778 MINCV1A1 Kinase Kinase

protMinc12157 MINCV1A1 Kinase Kinase

protMinc17408 MINCV1A1 Kinase Kinase

protMinc17409 MINCV1A1 Kinase Kinase

protMinc17770 MINCV1A1 Kinase Kinase

protMinc03435a MINCV1A1 Kinase Kinase

protMinc05143a MINCV1A1 Kinase Kinase

protMinc15232 MINCV1A1 Kinase Kinase

protMinc04069 MINCV1A1 Kinase Kinase

protMinc02762a MINCV1A1 Kinase Kinase

protMinc14392 MINCV1A1 Kinase Kinase

protMinc14396 MINCV1A1 Kinase Kinase

protMinc14890 MINCV1A1 Kinase Kinase

protMinc14891a MINCV1A1 Kinase Kinase

protMinc14891b MINCV1A1 Kinase Kinase

protMinc01266 MINCV1A1 Kinase Kinase

protMinc04139 MINCV1A1 Kinase Kinase

protMinc16016 MINCV1A1 Kinase Kinase

protMinc08018 MINCV1A1 Kinase Kinase

protMinc00773 MINCV1A1 Kinase Kinase

protMinc01451 MINCV1A1 Kinase Kinase

protMinc01835 MINCV1A1 Kinase Kinase

protMinc05609 MINCV1A1 Kinase Kinase

protMinc05610 MINCV1A1 Kinase Kinase

protMinc01894 MINCV1A1 Kinase Kinase

protMinc04376a MINCV1A1 Kinase Kinase

protMinc04401 MINCV1A1 Kinase Kinase

protMinc16132 MINCV1A1 Kinase Kinase

protMinc09022 MINCV1A1 Kinase Kinase

protMinc03671 MINCV1A1 Kinase Kinase

protMinc08395 MINCV1A1 Kinase Kinase

protMinc10867 MINCV1A1 Kinase Kinase

protMinc10869 MINCV1A1 Kinase Kinase

protMinc11373 MINCV1A1 Kinase Kinase

protMinc14888 MINCV1A1 Kinase Kinase

protMinc03641 MINCV1A1 Kinase Kinase

protMinc04643 MINCV1A1 Kinase Kinase

protMinc15719 MINCV1A1 Kinase Kinase

protMinc07847 MINCV1A1 Kinase Kinase

protMinc15082 MINCV1A1 Kinase Kinase

protMinc07852 MINCV1A1 Kinase Kinase

protMinc06664a MINCV1A1 Kinase Kinase

protMinc06664b MINCV1A1 Kinase Kinase

protMinc06664c MINCV1A1 Kinase Kinase

protMinc03649 MINCV1A1 Kinase Kinase

protMinc04635 MINCV1A1 Kinase Kinase

protMinc09472 MINCV1A1 Kinase Kinase

protMinc01370 MINCV1A1 Kinase Kinase

protMinc04359 MINCV1A1 Kinase Kinase

protMinc10977 MINCV1A1 Kinase Kinase

protMinc02904 MINCV1A1 Kinase Kinase

protMinc08064 MINCV1A1 Kinase Kinase

protMinc03666 MINCV1A1 Kinase Kinase

protMinc05711 MINCV1A1 Kinase Kinase

protMinc03081 MINCV1A1 Kinase Kinase

protMinc05629 MINCV1A1 Kinase Kinase

protMinc02163 MINCV1A1 Kinase Kinase

protMinc08604 MINCV1A1 Kinase Kinase

protMinc17946 MINCV1A1 Kinase Kinase

protMinc04926a MINCV1A1 Kinase Kinase

protMinc04926b MINCV1A1 Kinase Kinase

protMinc01307 MINCV1A1 Kinase Kinase

protMinc04561 MINCV1A1 Kinase Kinase

protMinc07404 MINCV1A1 Kinase Kinase

protMinc00736 MINCV1A1 Kinase Kinase

protMinc04884a MINCV1A1 Kinase Kinase

protMinc04884b MINCV1A1 Kinase Kinase

protMinc17774 MINCV1A1 Kinase Kinase

protMinc03015 MINCV1A1 Kinase Kinase

protMinc06361 MINCV1A1 Kinase Kinase

protMinc01265a MINCV1A1 Kinase Kinase

protMinc01265b MINCV1A1 Kinase Kinase

protMinc04140 MINCV1A1 Kinase Kinase

protMinc17985 MINCV1A1 Kinase Kinase

protMinc11713 MINCV1A1 Kinase Kinase

protMinc11010 MINCV1A1 Kinase Kinase

protMinc15400 MINCV1A1 Kinase Kinase

protMinc01610 MINCV1A1 Kinase Kinase

protMinc03453 MINCV1A1 Kinase Kinase

protMinc00980 MINCV1A1 Kinase Kinase

protMinc02430 MINCV1A1 Kinase Kinase

protMinc18052 MINCV1A1 Kinase Kinase

protMinc04806 MINCV1A1 Kinase Kinase

protMinc10619 MINCV1A1 Kinase Kinase

protMinc14515 MINCV1A1 Kinase Kinase

prot:Minc09187a MINCV1A1 Kinase Kinase Snf1-like protein kinase 2.7.11.1

prot:Minc09187c MINCV1A1 Kinase Kinase Snf1-like protein kinase 2.7.11.1

protMinc10490 MINCV1A1 Kinase Kinase

protMinc11994 MINCV1A1 Kinase Kinase

protMinc16458 MINCV1A1 Kinase Kinase

protMinc17804 MINCV1A1 Kinase Kinase

protMinc17805 MINCV1A1 Kinase Kinase

prot:Minc04534 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.17

prot:Minc01696 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.17

protMinc14305a MINCV1A1 Kinase Kinase

protMinc14305b MINCV1A1 Kinase Kinase

protMinc13886 MINCV1A1 Kinase Kinase

protMinc06802 MINCV1A1 Kinase Kinase

protMinc03375 MINCV1A1 Kinase Kinase

protMinc12096 MINCV1A1 Kinase Kinase

protMinc06879 MINCV1A1 Kinase Kinase

protMinc00717 MINCV1A1 Kinase Kinase

protMinc13678 MINCV1A1 Kinase Kinase

protMinc04425 MINCV1A1 Kinase Kinase

protMinc13682 MINCV1A1 Kinase Kinase

protMinc12578 MINCV1A1 Kinase Kinase

protMinc15404 MINCV1A1 Kinase Kinase

protMinc00188 MINCV1A1 Kinase Kinase

protMinc03301 MINCV1A1 Kinase Kinase

protMinc10311 MINCV1A1 Kinase Kinase

protMinc04166 MINCV1A1 Kinase Kinase

protMinc09811 MINCV1A1 Kinase Kinase

protMinc08927 MINCV1A1 Kinase Kinase

protMinc17580 MINCV1A1 Kinase Kinase

protMinc00310a MINCV1A1 Kinase Kinase

protMinc00310b MINCV1A1 Kinase Kinase

protMinc00310c MINCV1A1 Kinase Kinase

protMinc02719 MINCV1A1 Kinase Kinase

protMinc04964 MINCV1A1 Kinase Kinase

protMinc13814 MINCV1A1 Kinase Kinase

protMinc06930 MINCV1A1 Kinase Kinase

protMinc15999 MINCV1A1 Kinase Kinase

protMinc11187 MINCV1A1 Kinase Kinase

protMinc02792 MINCV1A1 Kinase Kinase

protMinc06083 MINCV1A1 Kinase Kinase

protMinc07957 MINCV1A1 Kinase Kinase

protMinc09908 MINCV1A1 Kinase Kinase

protMinc14168 MINCV1A1 Kinase Kinase

protMinc01214 MINCV1A1 Kinase Kinase

protMinc11438 MINCV1A1 Kinase Kinase

protMinc09797 MINCV1A1 Kinase Kinase

protMinc09798 MINCV1A1 Kinase Kinase

protMinc04422 MINCV1A1 Kinase Kinase

protMinc13679 MINCV1A1 Kinase Kinase

protMinc17545 MINCV1A1 Kinase Kinase

protMinc17546 MINCV1A1 Kinase Kinase

protMinc04017 MINCV1A1 Kinase Kinase

protMinc06364 MINCV1A1 Kinase Kinase

protMinc13168 MINCV1A1 Kinase Kinase

protMinc04263 MINCV1A1 Kinase Kinase

protMinc03013 MINCV1A1 Kinase Kinase

protMinc04220a MINCV1A1 Kinase Kinase

protMinc04220b MINCV1A1 Kinase Kinase

protMinc03850 MINCV1A1 Kinase Kinase

protMinc07871 MINCV1A1 Kinase Kinase

protMinc08494 MINCV1A1 Kinase Kinase

protMinc12835 MINCV1A1 Kinase Kinase

protMinc07361 MINCV1A1 Kinase Kinase

protMinc00307 MINCV1A1 Kinase Kinase

protMinc14127 MINCV1A1 Kinase Kinase

prot:Minc13952 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13820 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14466 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14471 MINCV1A1 Kinase Kinase Protein kinase-like

protMinc04976 MINCV1A1 Kinase Kinase

protMinc01219 MINCV1A1 Kinase Kinase

protMinc11432 MINCV1A1 Kinase Kinase

protMinc10314 MINCV1A1 Kinase Kinase

protMinc00909 MINCV1A1 Kinase Kinase

protMinc06565 MINCV1A1 Kinase Kinase

protMinc06733 MINCV1A1 Kinase Kinase

protMinc01390 MINCV1A1 Kinase Kinase

protMinc04380 MINCV1A1 Kinase Kinase

protMinc03164 MINCV1A1 Kinase Kinase

protMinc03192 MINCV1A1 Kinase Kinase

protMinc05801 MINCV1A1 Kinase Kinase

protMinc12407 MINCV1A1 Kinase Kinase

protMinc14870 MINCV1A1 Kinase Kinase

protMinc17699 MINCV1A1 Kinase Kinase

protMinc18603 MINCV1A1 Kinase Kinase

protMinc11744 MINCV1A1 Kinase Kinase

protMinc16584 MINCV1A1 Kinase Kinase

protMinc06574 MINCV1A1 Kinase Kinase

protMinc06743 MINCV1A1 Kinase Kinase

protMinc11064 MINCV1A1 Kinase Kinase

protMinc13207 MINCV1A1 Kinase Kinase

protMinc02156 MINCV1A1 Kinase Kinase

protMinc08595 MINCV1A1 Kinase Kinase

protMinc13865 MINCV1A1 Kinase Kinase

protMinc13388 MINCV1A1 Kinase Kinase

protMinc00779 MINCV1A1 Kinase Kinase

prot:Minc17734 MINCV1A1 Kinase Kinase Hexokinase 2.7.1.1

prot:Minc19166 MINCV1A1 Kinase Kinase Hexokinase 2.7.1.1

protMinc13660 MINCV1A1 Kinase Kinase

protMinc09808 MINCV1A1 Kinase Kinase

protMinc02889 MINCV1A1 Kinase Kinase

protMinc02152 MINCV1A1 Kinase Kinase

protMinc01308 MINCV1A1 Kinase Kinase

prot:Minc01720 MINCV1A1 Kinase Kinase Protein kinase C, phorbol ester/diacylglycerol binding; Rho GTPase activation protein

prot:Minc17481b MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.17

prot:Minc17481a MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.17

prot:Minc15199a MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.17

prot:Minc12148 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12151 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

protMinc14880 MINCV1A1 Kinase Kinase

protMinc17057 MINCV1A1 Kinase Kinase

prot:Minc07137 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.22

prot:Minc00155 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.12.1

prot:Minc10362 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.22

prot:Minc00175 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc12646 MINCV1A1 Kinase Kinase Hexokinase 2.7.1.1

prot:Minc00426 MINCV1A1 Kinase Kinase Hexokinase 2.7.1.1

prot:Minc00958 MINCV1A1 Kinase Kinase Serine/threonine protein kinase 2.7.11.1

prot:Minc08081 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc10407 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc10408 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12040 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12046 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12305 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12308 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13053 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc01193 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc01617 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc03457 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc01620 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc03462 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc18208 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc16533 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc18185 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc02365 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc19060 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13981 MINCV1A1 Kinase Kinase Receptor protein tyrosine kinase Axl-related

prot:Minc02867 MINCV1A1 Kinase Kinase Immunoglobulin-like fold

prot:Minc09963 MINCV1A1 Kinase Kinase Immunoglobulin-like fold

prot:Minc14377 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14370 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc02910 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc08062 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc03527 MINCV1A1 Kinase Kinase Serine/threonine protein kinase 2.7.11.1

prot:Minc04870 MINCV1A1 Kinase Kinase Serine/threonine protein kinase 2.7.11.1

prot:Minc15546 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc15547b MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc17856 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12406 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc18973b MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12860b MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc18973a MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12860a MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc04294 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12227 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc03836 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc07888 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc18109 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc08112 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc11811 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc10242 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc16517 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc15555 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13900 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc15554 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc08116 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc08115 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13904 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc06201 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc18094 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13905 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13403 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc06209 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13283 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13404 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc07608 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc06203 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13584 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc06208 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc16945 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13402 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13906 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13285 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13286 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13698 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13284 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc06206 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc16779 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc12202 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc13407 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc16944 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc16781 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc17424 MINCV1A1 Kinase Kinase Mitogen activated protein kinase kinase kinase 4 2.7.11.1

prot:Minc06202 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc09173 MINCV1A1 Kinase Kinase cAMP-dependent protein kinase inhibitor

prot:Minc03212 MINCV1A1 Kinase Kinase cAMP-dependent protein kinase inhibitor

prot:Minc10362 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.22

prot:Minc00155 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.12.1

prot:Minc12283 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13041 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc15172 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc12509 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc07338 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13244 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13345 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13476/prot:Minc13477 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13907 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13401 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc18542 MINCV1A1 Kinase Kinase Serine/threonine protein kinase, striated muscle

prot:Minc10845 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc11864 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc11865 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13005/prot:Minc13006 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14375 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc02915 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc08049 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14368 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14252 MINCV1A1 Kinase Kinase Serine/threonine protein kinase 2.7.11.1

prot:Minc00376 MINCV1A1 Kinase Kinase Serine/threonine protein kinase 2.7.11.1

prot:Minc10912 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc16963 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc03781 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc11720 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc11871 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc19197 MINCV1A1 Kinase Kinase BTB/POZ fold

prot:Minc07339 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc13245 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc06758 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc06587 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc03310 MINCV1A1 Kinase Kinase

prot:Minc08411 MINCV1A1 Kinase Kinase

prot:Minc13178 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc06912 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc06914 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc06918 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc15895 MINCV1A1 Kinase Kinase Interleukin-1 receptor

prot:Minc14703 MINCV1A1 Kinase Kinase Myosin Light Chain Kinase

prot:Minc06737 MINCV1A1 Kinase Kinase Calcium/calmodulin-dependent protein kinase 1

prot:Minc06569 MINCV1A1 Kinase Kinase Calcium/calmodulin-dependent protein kinase 1

prot:Minc12098 MINCV1A1 Kinase Kinase Protein kinase, core

prot:Minc13262 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc13399 MINCV1A1 Kinase Kinase Tyrosine protein kinase

prot:Minc13560 MINCV1A1 Kinase Kinase Cyclin-dependent kinase inhibitor

prot:Minc13861 MINCV1A1 Kinase Kinase Ankyrin

prot:Minc13997 MINCV1A1 Kinase Kinase PDZ/DHR/GLGF, Fcf2 pre-rRNA processing

prot:Minc13998 MINCV1A1 Kinase Kinase Serine/threonine protein kinase

prot:Minc09923 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc09924 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc09989a/b MINCV1A1 Kinase Kinase Casein kinase II, regulatory subunit

prot:Minc10200 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc10599 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc10978 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc14637 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc09752 MINCV1A1 Kinase Kinase Cyclin-dependent kinase inhibitor

prot:Minc09922 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.23

prot:Minc15813 MINCV1A1 Kinase Kinase ATP:guanido phosphotransferase

prot:Minc15884 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc16658 MINCV1A1 Kinase Kinase SANT, DNA-binding, MAP Kinase, p38

prot:Minc16837 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc17366 MINCV1A1 Kinase Kinase Serine/threonine protein kinase, striated muscle

prot:Minc17389 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc17848 MINCV1A1 Kinase Kinase ADP-specific phosphofructokinase/glucokinase

prot:Minc17964 MINCV1A1 Kinase Kinase Citron-like

prot:Minc18407 MINCV1A1 Kinase Kinase Dephospho-CoA kinase

prot:Minc18640 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc18771 MINCV1A1 Kinase Kinase SKP1 component

prot:Minc00860 MINCV1A1 Kinase Kinase Receptor protein tyrosine kinase Axl-related

prot:Minc10135 MINCV1A1 Kinase Kinase Receptor protein tyrosine kinase Axl-related

prot:Minc00424 MINCV1A1 Kinase Kinase Hexokinase

prot:Minc01901 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc02770 MINCV1A1 Kinase Kinase Glycerate kinase 2.7.1.31

prot:Minc07402 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc07854 MINCV1A1 Kinase Kinase Protein kinase-like 2.7.11.1

prot:Minc08001 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc05213 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc05702 MINCV1A1 Kinase Kinase Inositol polyphosphate kinase

prot:Minc05996 MINCV1A1 Kinase Kinase Prefoldin

prot:Minc09451b MINCV1A1 Kinase Kinase WIF domain

prot:Minc05611 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc01836 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc05775 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc06566 MINCV1A1 Kinase Kinase Protein Kinase-1, 3-phosphoinositide dependent

prot:Minc09010 MINCV1A1 Kinase Kinase Ribokinase

prot:Minc07403 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc03586 MINCV1A1 Kinase Kinase Adenylate kinase 2.7.4.3

prot:Minc02266 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc01387a MINCV1A1 Kinase Kinase Protein kinase C, delta/epsilon/eta/theta types

prot:Minc01397 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc04386 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc18546 MINCV1A1 Kinase Kinase Protein kinase, C-terminal

prot:Minc06733 MINCV1A1 Kinase Kinase Protein kinase, C-terminal 2.7.11.1

prot:Minc18078 MINCV1A1 Kinase Kinase Diacylglycerol kinase, catalytic region; Protein kinase C, phorbol ester/diacylglycerol binding; EF-Hand type

prot:Minc16534 MINCV1A1 Kinase Kinase Protein kinase-like

prot:Minc15814 MINCV1A1 Kinase Kinase ATP:guanido phosphotransferase 2.7.3.2

7.8 Chemosensory G Protein-Coupled Receptors (GPCRs)

Candidate GPCR genes were classified into superfamilies according to the C. elegans classification (Table S22 and Supplementary Methods, section 8.18). Family names start with sr, for serpentine receptor: Str (seven-TM receptor), Srg (sr class g), Sra (sr class a) and Solo (a family probably distantly related to sr). M. incognita possesses far fewer chemosensory genes than C. elegans or even C. briggsae. Nevertheless, as in C. elegans, the putative chemosensory GPCR genes of M. incognita are found as clusters of duplicated genes (refs 85, 86; Fig. S14). We found 11 clusters of M. incognita serpentine receptor genes, with at most six genes clustered together. Although the putative chemosensory GPCR genes of M. incognita are related to C. elegans genes, this does not imply that the identified genes play a chemosensory role in M. incognita biology. Even in C. elegans, analysing the function of chemoreceptors is difficult because of genetic redundancy among closely related receptors.
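
Counting clusters of this kind amounts to scanning each contig for runs of neighbouring receptor genes. The Python sketch below illustrates one way to do that. The cluster definition used here (at least two genes on the same contig, each no more than `max_gap` bp from the previous one), the 10 kb threshold, the gene names and the coordinates are all assumptions made for illustration; they are not taken from Supplementary Methods 8.18.

```python
from itertools import groupby

def find_clusters(genes, max_gap=10_000, min_size=2):
    """Return runs of genes on one contig whose neighbours are <= max_gap bp apart.

    genes: iterable of (name, contig, start, end) tuples.
    """
    clusters = []
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    for _, members in groupby(ordered, key=lambda g: g[1]):
        current = []
        for gene in members:
            # Close the current run when the gap to the previous gene is too large.
            if current and gene[2] - current[-1][3] > max_gap:
                if len(current) >= min_size:
                    clusters.append(current)
                current = []
            current.append(gene)
        if len(current) >= min_size:
            clusters.append(current)
    return clusters

# Hypothetical receptor genes and coordinates, for illustration only.
genes = [
    ("sra_a", "ctg1", 1_000, 2_500),
    ("sra_b", "ctg1", 4_000, 5_200),
    ("sra_c", "ctg1", 60_000, 61_000),
    ("srg_a", "ctg2", 500, 1_500),
]
clusters = find_clusters(genes)
largest = max(len(c) for c in clusters)
print(len(clusters), largest)
```

The number of clusters and the size of the largest one correspond to the two figures quoted in the text, although the values obtained depend on the gap threshold chosen.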

Table S22 | Putative serpentine receptor genes found in the M. incognita genome.

Family        C. elegans  C. briggsae  M. incognita
STR           722         146          0
SRG           382         123          41
SRA           141         67           21
Solo - srsx   40          21           46
Solo - srw    145         13           0
Solo - srz    105         5            0
Solo - srbc   84          5            0
Solo - srr    10          6            0
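
The per-species totals are not given in the table, but they can be derived from it to put a number on the reduction described in the text. A short Python sketch, using only the counts listed in Table S22 (the variable names are illustrative):

```python
# Counts per family, in the order (C. elegans, C. briggsae, M. incognita),
# copied from Table S22.
table_s22 = {
    "STR": (722, 146, 0),
    "SRG": (382, 123, 41),
    "SRA": (141, 67, 21),
    "Solo - srsx": (40, 21, 46),
    "Solo - srw": (145, 13, 0),
    "Solo - srz": (105, 5, 0),
    "Solo - srbc": (84, 5, 0),
    "Solo - srr": (10, 6, 0),
}
species = ("C. elegans", "C. briggsae", "M. incognita")

# Sum each column to obtain the total number of putative receptor genes per species.
totals = {
    name: sum(row[i] for row in table_s22.values())
    for i, name in enumerate(species)
}
for name, total in totals.items():
    print(f"{name}: {total}")
```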


Figure S14 | Chemosensory: Example of GPCR genes found in clusters. [Figure not reproduced; panels show predicted genes on C. elegans contig T21H8 and on M. incognita contig MiV1ctg187.]

C. elegans sra-37, sra-38 and sra-39 are found in a cluster on contig T21H8 (GenBank Z78546), similarly to five M. incognita Sra orthologs on contig MiV1ctg187.


7.9 Neuropeptides

Results of BLAST searches suggest that the neuropeptide complement of M. incognita is reduced

compared to that of C. elegans (19 flp genes and 21 nlp genes readily identifiable in the M.

incognita genome; Table S23 and Supplementary Methods, section 8.19). However,

additional searches may identify more distantly related sequences.


Table S23 | Neuropeptides, comparison of flp and nlp gene complements of C. elegans and M. incognita.

nlp gene*   C. elegans   M. incognita      flp gene   C. elegans   M. incognita
1           +            +                 1          +            +
2           +            +                 2          +            0
3           +            +                 3          +            +
4           +            0                 4          +            0
5           +            0                 5          +            +
6           +            +                 6          +            +
7           +            0                 7          +            +
8           +            +                 8          +            0
9           +            +                 9          +            0
10          +            +                 10         +            0
11          +            0                 11         +            0
12          +            +                 12         +            +
13          +            +                 13         +            +
14          +            +                 14         +            +
15          +            +                 15         +            0
16          +            0                 16         +            +
17          +            +                 17         +            0
18          +            +                 18         +            +
19          +            0                 19         +            +
20          +            0                 20         +            +
21          +            +                 21         +            +
22          +            +                 22         +            +
23          +            0                 23         +            0
34          +            0                 24         +            0
35          +            0                 25         +            +
36          +            +                 26         +            0
37          +            +                 27         +            +
38          +            +                 28         +            0
39          +            0                 29         0            0
40          +            +                 30         0            +
41          +            0                 31         0            +
42          +            +                 32         +            +
43          +            0
44          +            +
45          +            0

Grey box and + indicate gene present. 0 indicates absence of traces of this gene. *nlp-24 to nlp-33 were excluded because they encode antimicrobial peptides expressed in the epidermis, not in the nervous system.


7.10 Sex determination

Identifying homologs of many sex determination genes is difficult as they are frequently

highly diverged. For example, the amino acid identity between the C. elegans and C. briggsae

sdc-2 and sdc-3 genes is just 32% and 28%, respectively. In other cases the functional proteins

may be specific members of large multigene families in which the other members are of

unrelated function, and genetic evidence may be required to prove function. In spite of this, it

was possible to identify putative M. incognita homologs of several C. elegans sex

determination pathway genes using a combination of reciprocal best hit analysis and/or

phylogenetic analysis of gene families (Table S24).
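The reciprocal best hit criterion mentioned above can be illustrated with a minimal Python sketch. The input format (tuples of query, subject and bit score), the function names and all identifiers in the toy data are assumptions made for illustration; the actual searches, thresholds and phylogenetic follow-up behind Table S24 are not reproduced here.

```python
def best_hits(hits):
    """Return {query: subject}, keeping the highest-scoring subject per query.

    `hits` is an iterable of (query, subject, bit_score) tuples, an
    assumed, simplified stand-in for parsed BLAST output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    fwd = best_hits(a_vs_b)
    rev = best_hits(b_vs_a)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
ce_vs_mi = [("ce_gene1", "mi_prot1", 210.0), ("ce_gene1", "mi_prot2", 95.0),
            ("ce_gene2", "mi_prot3", 60.0)]
mi_vs_ce = [("mi_prot1", "ce_gene1", 205.0), ("mi_prot2", "ce_gene1", 90.0),
            ("mi_prot3", "ce_gene9", 80.0)]
rbh_pairs = reciprocal_best_hits(ce_vs_mi, mi_vs_ce)
```

In this toy case only the first pair is retained; `ce_gene2` is dropped because its best hit points back to a different gene, which is the kind of situation where phylogenetic analysis of the gene family would be needed instead.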


Table S24 | Comparison of the sex determination pathway genes of C. elegans and M. incognita.

C. elegans gene  Role                                                         Putative M. incognita homolog(s)
sex-1            Dosage counting                                              Not possible to determine a single likely homolog.
fox-1            Dosage counting                                              Not detected. Usually conserved in other nematodes.
xol-1            Integrator of X and autosomal dosage                         Not detected. Highly divergent in other nematodes.
sdc-1            Dosage compensation                                          2
sdc-2            Dosage compensation                                          Not detected. Highly divergent in other nematodes.
sdc-3            Dosage compensation                                          Not detected. Highly divergent in other nematodes.
her-1            Secreted co-coordinator protein                              Not detected. Usually conserved in other nematodes.
tra-2            Receptor for her-1                                           Not possible to determine a single likely homolog.
tra-3            Regulation of tra-2 processing                               1
fem-1            Cytoplasmic responder to TRA-2                               Not possible to determine a single likely homolog.
fem-2            Cytoplasmic responder to TRA-2                               1
fem-3            Cytoplasmic responder to TRA-2                               Not detected. Highly divergent in other nematodes.
mog-1            Repressor of FEM-3                                           2
mog-4            Repressor of FEM-3                                           1
mog-5            Repressor of FEM-3                                           1
mog-6            Repressor of FEM-3                                           3
tra-1            Global regulator of sex-specific transcription               2
fog-1            Promoter of spermatogenesis                                  Not detected. Usually conserved in other nematodes.
fog-3            Promoter of spermatogenesis                                  Not detected. Usually conserved in other nematodes.
mab-3            Regulator of male tail and neuron development                Not possible to determine a single likely homolog.
gld-3            Regulation of sperm/oocyte and mitosis/meiosis decisions     Not detected.
mag-1            Germline repressor of male promoting genes                   2
mab-23           Male differentiation and behaviour                           2
cpb-1            Executioner of spermatogenesis in cells identified by fog-1  1

Grey box indicates gene present.


7.11 RNAi genes

Genes of the C. elegans RNAi pathway were identified in the M. incognita genome. Genes

found are presented in Table S25. However, no homologs of sid-1, sid-2, rsd-2 and rsd-6,

genes involved in systemic RNAi and dsRNA spreading to surrounding cells, were found, as

also observed in B. malayi.

Table S25 | Comparison of RNAi pathway genes in M. incognita and B. malayi.

C. elegans public name      M. incognita   B. malayi

Dicer complex
dcr-1                       +              +
drh-1, drh-2                +              +
rde-1                       +              +
rde-2                       0              0
rde-3                       +              0
rde-4                       0              +
rde-5                       +              +

RISC complex
alg-1                       +              +
alg-2                       +              +
dFXR                        +              +
vig-1                       0              +
tsn-1                       0              +

RdRp amplification complex
ego-1, rrf-1, rrf-2         +              +
rrf-3                       +              +

Systemic RNAi (spreading)
rsd-2                       0              0
rsd-3                       +              +
rsd-6                       0              0
sid-1                       0              0
sid-2                       0              0

Cleavage of primary miRNA transcripts
drsh-1                      +              +

Required for RNAi
zfp-1                       0              +
smg-2                       +              +
smg-5                       0              0
mes-8                       0              +
mes-3                       0              0
mut-16                      0              0
gfl-1                       +              0

Grey box and + indicate gene present. 0 indicates absence of traces of this gene.


7.12 Orthologs of C. elegans lethal RNAi genes

We searched for M. incognita genes orthologous to C. elegans genes yielding lethal

phenotypes in RNAi experiments. We then specifically filtered genes that were not found in

taxa other than Nematoda. The rationale was to find Nematoda-restricted genes whose

inactivation is lethal. Such genes could represent interesting targets for the development of

more specific nematicides.

We screened the 2,958 C. elegans genes returning a lethal phenotype in RNAi experiments

(Supplementary Methods, section 8.20) against the 8-species OrthoMCL clusters generated

during the automatic annotation process (Supplementary Methods, section 8.9). This search

returned 1,909 OrthoMCL clusters, of which 1,083 contained a predicted M. incognita

ortholog.

A total of 491 clusters contained all the 8 taxa and probably represent core essential genes

common to all eukaryotes.

• 148 clusters contained 7 taxa.

• 25 clusters contained 6 taxa.

• 247 clusters contained 5 taxa.

• 131 clusters contained 4 taxa (108 of these are nematode-only clusters: C. elegans + C. briggsae + M. incognita + B. malayi)

• 39 clusters contained 3 taxa

  ◦ 3 clusters contained M. incognita + C. elegans + B. malayi

  ◦ 35 clusters contained M. incognita + C. elegans + C. briggsae

  ◦ 1 cluster contained M. incognita + C. elegans + D. melanogaster

• 2 clusters contained 2 taxa: M. incognita + C. elegans

A total of 148 clusters (108 + 3 + 35 +2) contain nematode-restricted genes whose C. elegans

orthologs show lethal phenotypes in RNAi inactivation experiments (Main Manuscript Fig.

4b). In these 148 clusters, 344 M. incognita proteins were found (231 in the 4-taxa MCLs,

106 in the 3-taxa MCLs containing C. elegans and C. briggsae, 3 in the 3-taxa MCLs

containing C. elegans and B. malayi, 4 in the 2-taxa MCLs).
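The filtering step and the tallies above can be checked with a short Python sketch. The dictionary representation of clusters, the cluster identifiers and the function name are assumptions for illustration and do not reflect the actual OrthoMCL output format; the counts are those quoted in the text.

```python
NEMATODES = {"C. elegans", "C. briggsae", "M. incognita", "B. malayi"}


def nematode_restricted(clusters):
    """Ids of clusters containing M. incognita and C. elegans members
    and no member outside Nematoda.

    `clusters` maps a cluster id to the set of taxa it contains
    (an assumed, simplified view of OrthoMCL output).
    """
    return {
        cid for cid, taxa in clusters.items()
        if {"M. incognita", "C. elegans"} <= taxa and taxa <= NEMATODES
    }


# Toy clusters with hypothetical ids.
toy = {
    "c1": {"C. elegans", "C. briggsae", "M. incognita", "B. malayi"},
    "c2": {"C. elegans", "M. incognita", "D. melanogaster"},
    "c3": {"C. elegans", "M. incognita"},
    "c4": {"C. elegans", "C. briggsae"},
}
restricted = nematode_restricted(toy)

# Tallies reported in the text above (number of taxa -> number of clusters).
clusters_by_taxa = {8: 491, 7: 148, 6: 25, 5: 247, 4: 131, 3: 39, 2: 2}
total_with_minc = sum(clusters_by_taxa.values())
nematode_only = 108 + 3 + 35 + 2
minc_proteins = 231 + 106 + 3 + 4
```

The three sums reproduce the 1,083 clusters containing an M. incognita ortholog, the 148 nematode-restricted clusters and the 344 M. incognita proteins they contain.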


8 Supplementary Methods

8.1 Biological material

8.1.1 Nematode strain and DNA preparation

The M. incognita population used for sequencing was originally collected in Morelos, Mexico

by H. Jarquin-Barberena and deposited into the RKN collection at INRA Sophia Antipolis,

France, in 1990. To eliminate any potential within-population heterogeneity, a line was raised

from the original field population starting from the progeny of a single female as follows.

Single females were carefully hand-dissected from the root tissues with their own egg-mass,

which was then used to reinoculate a tomato, Solanum lycopersicum (cv. Saint Pierre). Because

of the mitotic parthenogenetic mode of reproduction of M. incognita, such progeny can be

considered as a clonal line. Nematodes were maintained on tomatoes grown at 20°C in a

greenhouse. The high molecular weight DNA used for sequencing was prepared from

nematode eggs. Eggs were collected from infested roots after treatment of the egg masses in

0.5% NaOCl and rinsed with distilled water. They were further concentrated by centrifugation

at 2,000 g for 2 min in a 30% sucrose solution at 4°C, washed in distilled water, pelleted in a

microcentrifuge and stored at -80°C until DNA purification. Template DNA was purified

using the phenol/chloroform method⁸⁷, spectrophotometrically quantified, aliquoted and

stored at -80°C until use.

8.1.2 EST resources

To improve support and accuracy of automated gene prediction, we generated a specific

resource of about 40,000 ESTs from two developmental stages (infective 2nd-stage juvenile and

eggs) which complement the 18,907 M. incognita ESTs from NCBI-dbEST⁸⁸,⁸⁹ and

additional ESTs generated at INRA Sophia-Antipolis laboratory. After base-calling and

trimming, a total of 47,377 EST sequences were validated, including 17,162 ESTs from the J2

library and 6,790 ESTs from the egg library. After clustering (TIGR-TGICL⁹⁰), these 47,377

ESTs represent 11,644 "unique" sequences: 5,943 contigs and 5,701 singlets.


8.2 Genome sequencing and assembly

Paired-end sequences were obtained from plasmid and BAC libraries⁹¹ (Table S26) with

Sanger dideoxynucleotide technology on ABI3730xl DNA Analysers. We checked for

possible contamination by plant DNA using the UCSC BLAT program⁹² with tomato ESTs as

queries. This search returned no significant hit.

Table S26 | Origin of the sequence reads used in the assembly.

Library   Vector              Mean insert size   Reads
A         High-copy plasmid   3 kb               570,601
B         Low-copy plasmid    10 kb              395,044
C         BAC                 83 kb              8,293
D         BAC                 25 kb              26,935

8.3 Detection of scaffold pairs and triplets

Initially, an all-against-all comparison of M. incognita predicted proteins was performed using

the Smith-Waterman algorithm, and alignments with an e-value lower than 10⁻¹⁰ were retained.

Clusters of homologous genes have been generated accordingly. A single linkage clustering

with a Euclidean distance was used to group genes. The distances were calculated using the

gene index in each scaffold rather than the genomic position. The maximal distance between

two pairs of homologous genes on two different scaffolds was set to eight. We only retained

clusters that were composed of at least three genes showing homologs with conserved synteny

on two corresponding scaffolds. We found 1,473 clusters of homologous genes, containing

6,068 predicted genes and spanning about 54Mb. In the assembly, there are only 765

supercontigs (spanning approximately 55 Mb) that meet these criteria.
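The clustering rule described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code used for the analysis: each homologous gene pair between two scaffolds is represented as a point whose coordinates are the gene indices on each scaffold, the threshold of eight is treated as inclusive, and "at least three genes" is read as at least three homologous pairs. The function name and the toy data are hypothetical.

```python
import math


def synteny_clusters(pairs, max_dist=8.0, min_pairs=3):
    """Single-linkage clustering of homologous gene pairs.

    `pairs` is a list of (index_on_scaffold_A, index_on_scaffold_B)
    tuples, where each index is the rank of the gene along its scaffold
    rather than a genomic position. Two pairs are linked when their
    Euclidean distance is <= max_dist.
    """
    parent = list(range(len(pairs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if math.dist(pairs[i], pairs[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i, pair in enumerate(pairs):
        groups.setdefault(find(i), []).append(pair)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_pairs)


# Hypothetical gene indices: one collinear block plus two stray pairs.
pairs = [(1, 4), (2, 5), (4, 6), (7, 9), (40, 2), (41, 80)]
blocks = synteny_clusters(pairs)
```

Working on gene ranks rather than base-pair coordinates makes the threshold independent of intron and intergenic lengths, which is the motivation given in the text for using the gene index.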

We then aligned all paired supercontigs at the nucleotide level using Blast2⁹³. The observed

similarities and substitution rates (Fig. S3 and S4) confirmed the extensive shared homolog

content between 648 supercontigs. This procedure could not assign any allelic relationship

between small supercontigs, as they do not contain sufficient information to construct

clusters.

We found about 3.35 Mb of the assembly that corresponded to a third copy aligning with two

previously identified allelic supercontigs. We estimated the total size of regions present in

three independent copies by analysing the unassembled reads. All reads were aligned to the

supercontigs using Blast2. Matches were retained if they were at least 200 bases long and had


≥80% nucleotide identity. When more than 25 reads matched a pair of allelic supercontigs,

but were not included in the assembly, we concluded that a third version was present, but not

assembled. This enabled us to identify about 2.4 Mb of haploid genome sequence that may

represent a third copy of some sequence segments in addition to the 3.35 Mb previously

identified in the supercontigs. So, the total size of the genome that displays a third allelic

version may be estimated as 3.35 + 2.4 = 5.75 Mb, or about 11.5% of genome size, if we

estimate the haploid genome as being 50 Mb¹².
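The read-based criterion and the final estimate can be written out as a small Python sketch. The tuple representation of read matches and the function name are assumptions for illustration; the thresholds and the figures in Mb are those given in the text.

```python
def supports_third_copy(read_matches, min_len=200, min_identity=80.0, min_reads=25):
    """Return True when more than `min_reads` unassembled reads match a
    pair of allelic supercontigs with sufficient length and identity.

    `read_matches` is a list of (aligned_length, percent_identity) tuples,
    an assumed summary of Blast2 output for one supercontig pair.
    """
    kept = [
        (length, identity)
        for length, identity in read_matches
        if length >= min_len and identity >= min_identity
    ]
    return len(kept) > min_reads


# Hypothetical matches: 26 pass both filters, 20 fail one of them.
matches = [(250, 92.0)] * 26 + [(150, 99.0)] * 10 + [(300, 75.0)] * 10
flagged = supports_third_copy(matches)

# Figures quoted in the text, in Mb.
third_copy_mb = 3.35 + 2.4
fraction_of_haploid = third_copy_mb / 50.0
```

With a 50 Mb haploid genome, the 5.75 Mb total corresponds to a fraction of 0.115, i.e. the 11.5% stated above.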

8.4 Detection of repetitive elements

Repetitive and transposable elements (TEs) were first predicted “ab initio”, (i) beginning with

an all-by-all genome comparison with BLASTER⁹⁴ using BLASTN⁹³, (ii) then grouped with

GROUPER⁹⁴, RECON⁹⁵ and PILER⁹⁶ using default parameters. Consensus sequences were

built with MAP⁹⁷ and classified according to BLASTER matches using TBLASTX and

BLASTX with the entire Repbase⁹⁸ as reference data bank, and according to the presence of

any terminal repeats (LTR, TIR or polyA tail). For example, a consensus is defined as MITE

if (i) it carries TIRs; (ii) it doesn't match via tBlastx or Blastx with known TEs; (iii) its length

without its TIRs is lower than 500bp. The consensus set was then analyzed by an all-by-all

BLASTER procedure to remove redundancies, i.e. when a consensus sequence is included

into another at a 95% identity threshold and 98% length threshold.
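The two rules given as examples above (MITE classification and redundancy removal) can be expressed as simple predicates. This is a minimal sketch: the function names and argument forms are hypothetical, and treating the 95% and 98% thresholds as inclusive is an assumption.

```python
def is_mite(has_tirs, matches_known_te, length_without_tirs):
    """Apply the three MITE criteria quoted in the text.

    (i) the consensus carries TIRs; (ii) it has no TBLASTX/BLASTX match
    to known TEs; (iii) its length without TIRs is below 500 bp.
    """
    return has_tirs and not matches_known_te and length_without_tirs < 500


def is_redundant(percent_identity, covered_fraction):
    """A consensus is considered included in another one when it matches
    at >= 95% identity over >= 98% of its length (inclusive bounds are
    an assumption)."""
    return percent_identity >= 95.0 and covered_fraction >= 0.98
```

For instance, `is_mite(True, False, 320)` holds, whereas a TIR-bearing consensus of 500 bp or more, or one matching a known TE, is not classified as a MITE.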

Then, repeats were annotated using an improved version of the TE annotation pipeline

described by Quesneville et al.⁹⁴ Briefly, this pipeline is composed of (i) the TE detection

software BLASTER, RepeatMasker and Censor⁹⁹; (ii) the satellite detection software

RepeatMasker¹⁰⁰, TRF¹⁰¹ and Mreps¹⁰². To save computer time and reduce software memory

requirements, we segmented the genomic sequences into chunks of 200 kb overlapping by 10

kb. Each chunk was then independently analyzed by the different programs. Simple repeats

were used to filter out spurious hits. TE or repeat copies less than 20 bp after removing simple

repeat regions were discarded.
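The segmentation into overlapping chunks can be sketched as below. The coordinate convention (0-based, end-exclusive) and the function name are assumptions; "200 kb overlapping by 10 kb" is interpreted here as a window of 200,000 bases advanced in steps of 190,000 bases.

```python
def chunk_coordinates(seq_len, chunk=200_000, overlap=10_000):
    """Yield 0-based, end-exclusive (start, end) windows covering a
    sequence of length `seq_len`, each window sharing `overlap` bases
    with the next one."""
    step = chunk - overlap
    start = 0
    while True:
        end = min(start + chunk, seq_len)
        yield start, end
        if end == seq_len:
            break
        start += step


# A hypothetical 450 kb sequence is split into three windows.
windows = list(chunk_coordinates(450_000))
```

The overlap ensures that a repeat copy spanning a chunk boundary is seen whole in at least one chunk, provided it is shorter than the overlap.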

To take into account the fact that TEs often insert within other pre-existing TEs leading to

fragmented segments of the same copy, a specific “long join” annotation procedure has been

performed, using age estimates of repeat fragments. Indeed the identity percentage between a

fragment and its reference TE/repeat consensus can be used to estimate the age of this

fragment. Consecutive fragments on both the genome and the same reference repeat

consensus were automatically joined if their identity percentage difference was less than 2%


(i.e. the two fragments had approximately the same age) and (i) if they were separated by a

gap of less than 5,000 bp and/or by a mismatch region of less than 500 nucleotides, or (ii) if

there were nested repeats (e.g. the fragments were separated by a sequence of which more

than 95% consisted of other younger repeat insertions, all inserts having a higher identity

compared to their respective consensus). Fragments separated by more than 100kb were not

joined. At the end, nested repeats were split if inner repeat fragments were longer than outer

joined fragments.

8.5 Detection of non-coding RNAs

The annotation of ncRNA molecules was performed using the LeARN¹⁰³ annotation platform.

Four methods were used in the detection pipeline: (i) tRNAscan-SE¹⁰⁴ for transfer RNA

(tRNA) gene detection; (ii) NCBI-BLASTN versus a ribosomal RNA (rRNA) sequence

database for identification of large ribosomal sequences; (iii) the Rfam¹⁰⁵ database release 8.1 to

detect common ncRNA families; and (iv) a mirfold-based¹⁰⁶ pipeline using the miRBase¹⁷

library as a source of microRNA (miRNA) candidates.

8.6 Splice Leaders (SL) annotation and detection

We searched for SL1 genes using the LeARN¹⁰³ platform and BLASTN search against M.

incognita contig sequences. We selected the BLAST hits that showed at least 19/22

nucleotide identity. The conserved SL sequences and the 100-nt downstream regions were

aligned with Multalin¹⁰⁷ and ClustalW¹⁰⁸ to define homology groups. The secondary structure

of Mi-SL1 genes was predicted using free energy minimization and the software RNAshapes¹⁰⁹.

Spliced leaders on ESTs were identified by BLASTN searches against M.

incognita ESTs from NCBI (dbEST release January 25, 2008), excluding cDNA libraries built

with an SL1 amplification primer.

8.7 Detection of Operons

The set of C. elegans operons was downloaded from WormBase¹¹⁰ through WormMart

(release 185). The B. malayi genome annotations²⁸ were downloaded from WormBase (as

genome feature format [GFF] files). We used M. incognita GFF files describing the predicted

features in the genome, including the 19,212 protein coding genes. In addition, for M.

incognita, we also included orthology information with other nematode genes from the

complete set of OrthoMCL¹¹¹ clusters defined in the automatic annotation process

(Supplementary Methods, section 8.9). These data were parsed into a custom relational


database that permitted calculation of genic spacing, and correlated genes from different taxa

through their MCL ortholog memberships.

In the absence of a specific SL tag to identify M. incognita operons, the genes were surveyed

for spacing. In order to avoid false-positives, consisting of potentially mis-predicted

consecutive gene models, we required a minimum spacing of 25 bases between two gene

models. We therefore defined operons as containing genes separated by <1001 bases, and

counted spacings <25 bases as ‘intragenic breaks’. Additionally, it was not possible to

determine unequivocally spacings for the 5,713 genes at the ends of contigs, and these genes

were therefore designated the first genes in operons where they had <1001 bases between

them and the next gene downstream regardless of the upstream spacing to the end of the

contig. These M. incognita operons have been designated MIOP#### where the # indicates an

index number, and a list of operons and their membership is given in Table S4 (separate file).
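The spacing criteria above can be illustrated with a short Python sketch. Several points are assumptions made for illustration only: genes are given as 1-based inclusive (name, start, end) tuples for a single contig, sorted by start; strand and contig-end handling are ignored; and a spacing below 25 bases is treated as ending the current run of genes rather than extending it. The function names are hypothetical.

```python
def classify_gap(gap):
    """Classify an intergenic spacing (in bases) using the thresholds
    given in the text."""
    if gap < 25:
        return "intragenic_break"
    if gap < 1001:
        return "operonic"
    return "separate"


def find_operons(genes):
    """Group consecutive genes linked by 'operonic' spacings.

    `genes` is a list of (name, start, end) tuples for one contig,
    sorted by start, with 1-based inclusive coordinates. Only runs of
    two or more genes are reported.
    """
    operons, current = [], []
    for i, (name, start, _end) in enumerate(genes):
        if current and classify_gap(start - genes[i - 1][2] - 1) == "operonic":
            current.append(name)
        else:
            if len(current) >= 2:
                operons.append(current)
            current = [name]
    if len(current) >= 2:
        operons.append(current)
    return operons


# Hypothetical gene models on one contig.
genes = [("g1", 1, 1000), ("g2", 1301, 2500), ("g3", 2600, 3500),
         ("g4", 6000, 7000), ("g5", 7010, 8000)]
operons = find_operons(genes)
```

Here g1, g2 and g3 form one candidate operon (spacings of 300 and 99 bases), g4 is too far downstream (2,499 bases), and the 9-base spacing between g4 and g5 is classified as an intragenic break.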

8.8 Gene model predictions

Gene model predictions were produced using the integrative gene prediction platform

EuGene¹¹². To evaluate the accuracy of the produced automatic predictions, a reference

dataset of 230 M. incognita protein-coding gene models was built from EST libraries by

isolating full-length cDNA. After manual curation, all the gene model structures retained in

the reference set are supported by the alignment of their genomic sequence with the

corresponding full-length transcript sequence.

Statistical models of DNA composition (Interpolated Markov Models) for M. incognita were

trained using similarities with SwissProt¹¹³ and Wormpep¹¹⁴ databases and spliced

alignments. Regions intersecting the reference dataset were removed. The EuGene combiner

was then used to build a consensus annotation integrating both the statistical models and the

evidence below:

• Translation starts and splice site predictions provided by the SpliceMachine

software¹¹⁵. Splice site models were trained on M. incognita using sites from spliced

alignments of genomic contigs to ESTs of M. incognita, produced using

GenomeThreader¹¹⁶. Translation start models were trained using regions of high

similarity with the N-terminal region of SwissProt and Wormpep proteins. Sites

appearing in the reference dataset were removed from the training set.


• Spliced alignments of M. incognita ESTs and corresponding tentative consensus

aligned using GenomeThreader¹¹⁶.

• Protein similarities with the curated protein database SwissProt (Nov 2007) and the

reference database Wormpep (release 180, Oct 2007) detected using NCBI-BLASTX⁹³.

• Sequence conservation with available nematode genomes (C. remanei, C. briggsae, C.

elegans, B. malayi) computed using NCBI-TBLASTX.

• Alignments of nematode EST sequences (excluding M. incognita ESTs) produced by

NCBI-TBLASTX.

• Repeats identified by RepeatMasker searches against Repbase⁹⁸ and the M. incognita

repeat database were used for soft masking to prevent the prediction of gene models

matching transposable element sequences.

Splice variants were predicted each time EST/mRNA evidence of different isoforms was

observed (inconsistent splicing pattern in different spliced transcripts), each predicted isoform

being constrained to follow one of the splicing patterns observed¹¹². All results are

summarized in Table S6.

8.8.1 Accuracy assessment

In a first step, the accuracy of the pipeline was assessed on the 230 gene models from the built

reference dataset (Fig. S15). Genes and exons are considered as correctly predicted when all

coordinates are precisely identified. Depending on the different types of evidence supporting

the predicted gene model, the accuracy at the level of complete gene structure ranged from

57% sensitivity in the absence of any similarity to 73.5% when the locus was supported by

ESTs, protein similarities and other sequence conservation. Similarly, the exon sensitivity

ranged from 73.2% to 89.3%.
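The exact-match accuracy measures can be sketched in Python as follows. The formulas are an assumption based on common usage in gene prediction assessment (sensitivity as the fraction of reference features predicted exactly, specificity as the fraction of predicted features that are exactly correct); the source does not spell them out. The function name and the toy coordinates are hypothetical.

```python
def sensitivity_specificity(reference, predicted):
    """Exact-match sensitivity and specificity for two collections of
    features (e.g. exons given as (start, end) tuples).

    Sn = TP / number of reference features
    Sp = TP / number of predicted features
    A feature counts as a true positive only if all of its coordinates
    are identical in both collections.
    """
    ref, pred = set(reference), set(predicted)
    tp = len(ref & pred)
    sn = tp / len(ref) if ref else 0.0
    sp = tp / len(pred) if pred else 0.0
    return sn, sp


# Hypothetical exons: two exact matches, one shifted boundary, one missed.
ref_exons = [(100, 200), (300, 450), (600, 700), (900, 1000)]
pred_exons = [(100, 200), (300, 450), (610, 700)]
sn, sp = sensitivity_specificity(ref_exons, pred_exons)
```

Because a single shifted boundary disqualifies an exon, and a single wrong exon disqualifies a gene, gene-level values are expected to be lower than exon-level values, as observed in the figures quoted above.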


Figure S15 | Performance of the EuGene gene prediction pipeline against the reference models. Sn and Sp G/E/N stand for Gene/Exon/Nucleotide-level sensitivities and specificities, respectively. The TBX, BX and EST tags indicate the type of evidence integrated for the prediction (TBX stands for NCBI-TBLASTX, BX for NCBI-BLASTX and EST for GenomeThreader EST spliced alignments). Values plotted (%):

        MiEGN (ab-initio)   MiEGN-TBX   MiEGN-TBX-BX   MiEGN-TBX-BX-EST
Sn G    57.0                60.0        63.5           73.5
Sp G    56.2                59.0        62.4           73.2
Sn E    73.2                76.5        78.9           89.3
Sp E    93.3                92.9        93.2           94.5
Sn N    77.4                79.7        82.0           92.9
Sp N    98.3                97.9        98.1           99.6

Spliced alignments of 47,377 ESTs on the genome were generated using GenomeThreader¹¹⁶

(alignment with at least 95% identity and alignment of more than 80% of EST length). Out of

47,377 ESTs, 30,837 (65%) had a valid alignment: 18,556 ESTs (39%) had only one

match/hit, 10,810 ESTs (22.9%) had two hits on two different contigs, and 1,471 ESTs (3.1%)

had more than two hits on different contigs.

8.9 Automatic functional annotation

For comparative analysis of InterPro domains found in the predicted M. incognita proteome,

we ran the program InterProScan on a set of seven other species: three nematodes

(C. elegans [WormPep183], C. briggsae [rel2] and B. malayi [rel1]), one insect (D.

melanogaster [rel5.4]) and three fungi (M. grisea, G. zeae and N. crassa). We attributed IPR

domains, assigned associated Gene Ontology¹¹⁷ terms (Fig. S11), and identified proteins with

a signal peptide using SignalP¹¹⁸,¹¹⁹ as embedded in InterProScan (Table S8). To homogenize

the granularity level of annotation between organisms, for each non-overlapping set of

domains found, we only kept the root domain. We used the hierarchical organization of

domains proposed in the “Parent-Child” description available on the EBI public ftp server

(ftp://ftp.ebi.ac.uk/pub/databases/interpro/ParentChildTreeFile.txt). For example, all CYP

proteins which have a P450 domain (IPR002949, IPR002397, IPR008070 …) were counted at

their root domain (IPR001128) (Table S9).
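Collapsing domain hits to their root entry can be sketched as follows. The parent map is written by hand for the P450 example given in the text; placing each accession as a direct child of IPR001128 is an assumption for illustration (only the root relationship is stated in the text), and the sketch does not parse the EBI file. The function name is hypothetical.

```python
from collections import Counter


def root_domain(ipr, parent_of):
    """Follow child -> parent links until an entry without a parent is
    reached. `parent_of` maps a child accession to its parent."""
    seen = set()
    while ipr in parent_of and ipr not in seen:
        seen.add(ipr)  # guard against accidental cycles in the map
        ipr = parent_of[ipr]
    return ipr


# Toy parent map for the P450 example (direct links are an assumption).
parent_of = {
    "IPR002949": "IPR001128",
    "IPR002397": "IPR001128",
    "IPR008070": "IPR001128",
}

# Hypothetical domain hits from four proteins.
hits = ["IPR002949", "IPR002397", "IPR008070", "IPR001128"]
counts = Counter(root_domain(h, parent_of) for h in hits)
```

All four hits are counted under IPR001128, so species annotated at different depths of the hierarchy become comparable.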

We also generated clusters of orthologous protein-coding genes between M. incognita and the

seven other species listed previously using OrthoMCL¹¹¹. Clusters were constructed on the

basis of multidirectional reciprocal best BLAST⁹³ hits with an MCL clustering algorithm. This

method allows clustering of candidate orthologs between different species but also includes

in-paralogs (resulting from species specific-duplications) inside clusters. Thus, OrthoMCL

clusters can range from one-species clusters (only composed of species-specific in-paralogs)

to eight-species clusters (representing genes shared between all the species considered here).

8.10 Detection and annotation of CAZymes

8.10.1 Detection of CAZymes and modular annotation

CAZymes are characterized by their specific catalytic modules (GHs for Glycoside

Hydrolases and transglycosidases, GTs for GlycosylTransferases, PLs for Polysaccharide

Lyases and CEs for Carbohydrate Esterases), occasionally associated with carbohydrate-binding

modules (CBMs). Those modules were searched in M. incognita predicted proteins using the

same routines as for the daily updates of the Carbohydrate-Active enZymes (CAZy)

database (http://www.cazy.org) and used in refs 120-122. In CAZy, protein sequences are cut

into their constitutive modules (catalytic modules, CBMs and other non-catalytic modules or

domains of unknown function like expansins). Each is assigned to a class and family

according to its similarity to manually created groups correlating with a same 3D fold. The

resulting fragments are assembled in families of module sequences and formatted as BLAST⁹³

libraries. Each protein model from M. incognita was scanned (using BLASTp) against

libraries of more than 150,000 individual modules using a database size parameter identical to

that of the NCBI’s nr (non-redundant) database. Models that returned an e-value smaller than

0.1 were automatically sorted on an intermediate report and manually analyzed. Manual

analysis involved examination of the alignment of the model with the various members of

each family (whether of catalytic or non-catalytic modules), with a search of the conserved

signatures/motifs characteristic of each family. The presence of the catalytic machinery was

verified for distant relatives whenever known in the family. The models that displayed


significant similarities and respected our criteria were kept for functional annotation and

classified in the appropriate classes and families. Our modular annotation also allowed us to

identify truncated alignments with CAZyme modules or other problems probably due to

classical automatic gene prediction errors. The ensemble of modular annotations as well as

possible associated gene model errors was uploaded to the publicly available M. incognita

genome database.

8.10.2 Functional annotation of detected CAZymes

As for the majority of new genome sequences released, most protein models from M.

incognita have no associated experimentally assessed biochemical data. Thus all protein

models were systematically compared against CAZy family members and specifically against

biochemically characterized examples. This strategy allows a straightforward activity

prediction for protein models with close similarity to experimentally characterized CAZymes

from nematodes. Nevertheless, functional prediction is more difficult for large families

lacking biochemical characterization in nematodes. The degree of predictability of enzyme

action will then vary from very precise activity predictions, to very wide descriptors,

dependent on more remote relationships to characterized cases. Typically, we followed an

annotation gradient ranging from candidate “enzyme activity” when the similarity was close,

to “distantly related to enzyme activity” when similarity was more remote. We assigned to M.

incognita candidate CAZymes an Enzyme Committee (EC) number only in cases where

100% identity to an experimentally characterized CAZyme was observed. All functional

annotations were uploaded on the M. incognita publicly available database.

8.10.3 Comparison of CAZyme repertoires

We compared the abundance and distribution (in classes and families) of the whole set of

CAZymes encoded by M. incognita to those of D. melanogaster and C. elegans. When we

identified major differences in abundance in a given family, we further investigated the family

in M. incognita and compared its composition to those of three fungi: two plant pathogens,

Magnaporthe grisea¹²³ and Gibberella zeae¹²⁴, and one saprobe, Neurospora crassa¹²⁵. When

CAZyme families were found to be present in M. incognita but missing both in C. elegans

and D. melanogaster, we checked for their pattern of presence / absence in all domains of life

through the CAZy database and systematically BLASTed M. incognita representatives against

all CAZymes (including bacteria and viruses). This allowed the identification of CAZymes

that may have been acquired via horizontal gene transfer (HGT) in M. incognita. Similarly,


we also carefully examined cases where a CAZyme family was present both in C. elegans and

D. melanogaster but not detected in M. incognita. To assess whether this could be related to

gene loss in M. incognita, we extensively checked for the presence of corresponding CAZyme

modules using tBLASTn searches against the assembled genome as well as unplaced reads.

8.10.4 Search for homologs of M. incognita PCWD enzymes

Each of the 61 identified candidate PCWD enzymes as well as the 20 candidate expansins and

two candidate invertases were searched by BLAST against the publicly available protein

database NR at NCBI and against the CAZy database. For each BLAST search, the catalytic

module was used as a query and taxa corresponding to the closest BLAST hits were

systematically reported in Table S12. In most cases these enzymes were already known in

PPN and best BLAST hits were in Tylenchida in these cases. In two instances, no hits in

Metazoa were found except in M. incognita (polygalacturonases and xylanases). Additionally,

in two cases (GH43 and GH32), the enzymes were not previously identified in PPN

(including in M. incognita) and best hits were bacterial.

8.11 Annotation of the proteases set

Predicted M. incognita proteins resulting from the automatic annotation process were scanned

for conserved protease motifs/domains available in the InterPro33, 34 database, release 16.1.

Predicted proteins containing such motifs were manually inspected and compared to

annotated proteases from C. elegans and C. briggsae (Wormbase110 database, release WS185) and

B. malayi (Brugia malayi genome annotation database at the TIGR web server,

http://www.tigr.org/tdb/e2k1/bma/), in order to distinguish between true proteases and false

positives.

The MEROPS database126, release 8.0, was used to classify M. incognita putative proteases

into families and sub-families. Putative secreted proteases were identified using SignalP118.

8.12 Antioxidant enzymes

Manual annotation was conducted to fix gene splits, gene fusions and other prediction errors

with alignment-based evidence using BLAST at NCBI and WormBase against available

genomic and EST sequences.

8.13 Glutathione-S-transferases

Orthologs of C. elegans GST genes were first identified by OrthoMCL111. Additional protein

sequences carrying the InterPro signatures IPR004045 and IPR004046 were retrieved from

C. elegans and insects. These sequences were used in BLAST searches against the M. incognita genome and EST

datasets. Manual annotation was conducted based on multiple sequence alignments and

phylogenetic analysis.

8.14 Cytochromes P450

M. incognita CYPs retrieved from the genome and EST libraries were annotated by direct

comparison to the most closely related C. elegans CYP. Estimation of relationships was

facilitated by phylogenetic analysis. In some cases the low percent identity of M. incognita

CYPs to named C. elegans CYPs precluded formal assignment of CYP names; this is

indicated in Table S18.

8.15 Immune response

Orthologs of C. elegans genes were identified with OrthoMCL. Additional homologous genes

were sought by BLAST searches against the M. incognita genome sequence or predicted

proteins. The identified M. incognita predicted proteins were checked for conservation of the

InterPro domains of the query. Chitinases were identified as described in Supplementary

Methods, section 8.10.
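
The domain-conservation check mentioned above can be illustrated with a short sketch. It assumes that "conservation" means every InterPro domain of the query is also found in the candidate protein; the text does not state the exact criterion, so this is an assumption. The function names and the domain and protein identifiers are hypothetical.

```python
def missing_query_domains(query_domains, hit_domains):
    """Return the query's InterPro domains that the hit lacks (sorted)."""
    return sorted(set(query_domains) - set(hit_domains))


def domains_conserved(query_domains, hit_domains):
    """True if the hit carries every domain of the query (assumed criterion)."""
    return not missing_query_domains(query_domains, hit_domains)


# Hypothetical identifiers, for illustration only.
query = ["IPR_X", "IPR_Y"]
candidates = {
    "candidate_1": ["IPR_X", "IPR_Y", "IPR_Z"],  # extra domains are allowed
    "candidate_2": ["IPR_X"],                    # lacks IPR_Y
}

retained = [name for name, doms in candidates.items()
            if domains_conserved(query, doms)]
print(retained)
```

A looser criterion (for example, requiring only the catalytic domain) would change which candidates are retained, so the threshold should be stated alongside any result.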

8.16 Detection and annotation of Nuclear Receptors (NRs)

Candidate Nuclear Receptors (NRs) from M. incognita predicted protein sequences were

retrieved by BLASTing the predicted protein database with sequences from each NR

superfamily. The retrieved sequences were incorporated into a dataset containing NR

sequences from all superfamilies from various metazoan groups. Sequences were aligned with

MUSCLE127, 128, manually checked and analyzed with PhyML129 using the JTT substitution

model130, with rate heterogeneity between sites corrected by a gamma distribution (eight rate

categories). The EST dataset and unplaced reads were also analysed in order to find traces of

any classical NRs that were not found in the prediction dataset.

8.17 Annotation of Kinases

We searched for kinases in the genome of M. incognita, using kinase domains from the Swiss-

Prot database as queries against EuGene predicted proteins. Following this analysis,

OrthoMCL was utilized to identify orthologs of each of the M. incognita genes in the

following seven taxa: C. elegans, C. briggsae, B. malayi, D. melanogaster, M. grisea, G.

zeae, and N. crassa. In addition, BLAST and multalin were employed to manually annotate

all M. incognita-specific kinase genes as well as a selection of conserved kinase genes.

8.18 GPCRs

We downloaded from Wormbase110 all putative C. elegans and C. briggsae GPCR protein

sequences involved in chemosensory function. All are designated serpentine receptors (SRs)

based on Robertson and Thomas85 (approximately 1,280 intact genes and approximately 420

apparent pseudogenes, i.e. about 7% of C. elegans genes). OrthoMCL groups containing

M. incognita genes were sorted based upon C. elegans superfamily classification.

8.19 Neuropeptides

Identification of M. incognita candidate neuropeptides initially relied on the OrthoMCL

clusters generated during the automated functional annotation process. This approach

identified only a few candidates. Most neuropeptide genes were instead

identified using BLAST searches against M. incognita predicted proteins, assembled

scaffolds, unplaced reads and ESTs. In the first instance, search queries were constructed

from EST-derived predictions of M. incognita FLP and NLP peptides where available131.

Where these were unavailable or unsuccessful, predicted C. elegans FLPs and NLPs were

used as search strings132-134.

8.20 Orthologs of C. elegans lethal RNAi genes

An analysis of the repository of RNAi experiments in Wormbase110

showed that phenotypes

are organized in a complex ontology. The general term describing Lethality in Wormbase is

"Lethal". This term has many children and RNAi experiments are not all annotated at the

same granularity level. We therefore needed to screen for all terms having a direct

linkage with lethality in order to retrieve all experiments leading to an observed morbid

phenotype. Thus, we selected all descriptors either containing the term "Lethal" or a synonym

in their definition and all descriptors having a direct one to one relation (a single parent) with

a "lethal term". This screen returned 50 terms, of which 48 are non-redundant (two terms are

repeated twice).
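
The term-selection rule described above can be sketched as follows. This is a minimal illustration on a toy ontology, not the screen actually performed: the data layout (a dictionary of terms with a definition and a list of parents), the term identifiers and the keyword list are all hypothetical.

```python
def select_lethal_terms(terms, keywords=("lethal",)):
    """Select phenotype terms linked to lethality.

    terms:    dict mapping term id -> {"definition": str, "parents": [ids]}.
    keywords: "lethal" plus any synonyms to look for (hypothetical list).
    A term is kept if (a) its definition mentions a keyword, or
    (b) it has exactly one parent and that parent satisfies (a).
    Returning a set removes terms that would otherwise be listed twice.
    """
    def mentions_keyword(text):
        lowered = text.lower()
        return any(k in lowered for k in keywords)

    seeds = {tid for tid, t in terms.items()
             if mentions_keyword(t["definition"])}
    single_parent_children = {
        tid for tid, t in terms.items()
        if len(t["parents"]) == 1 and t["parents"][0] in seeds
    }
    return seeds | single_parent_children


# Toy ontology with hypothetical identifiers and definitions.
toy_terms = {
    "T1": {"definition": "Lethal: animals die.", "parents": []},
    "T2": {"definition": "Embryonic lethal.", "parents": ["T1"]},
    "T3": {"definition": "Larvae arrest at an early stage.", "parents": ["T1"]},
    "T4": {"definition": "Slow growth.", "parents": ["T1", "T5"]},
    "T5": {"definition": "Developmental variant.", "parents": []},
    "T6": {"definition": "Uncoordinated movement.", "parents": ["T5"]},
}

selected = select_lethal_terms(toy_terms)
print(sorted(selected))
```

In this toy example T3 is kept because its only parent is a lethal term, whereas T4 is excluded because it has two parents.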

Screens against Wormbase (using WormMart) with the 48 "lethal terms" returned 2,958 unique

C. elegans gene accession numbers for which lethal phenotypes were observed.

Parameters:

• Database used: Wormbase rel. 185

• Dataset used: RNAi

• Filters: Phenotypes annotation includes "only" [Scoring] Observed, Limit to

Phenotype IDs of Type: Pheno WB ID: list of 48 lethal terms.

• Attributes: Gene Public Name, Gene Seq Name.

We scanned these 2,958 accession numbers against OrthoMCL111 clusters constructed from the

raw comparative analysis of M. incognita proteins with seven other species.
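
The final lookup can be sketched as a scan over ortholog clusters. This is a minimal illustration assuming each cluster is available as a list of (species code, gene identifier) pairs; the species codes ("cel", "min", "dme"), cluster names and gene identifiers are hypothetical and do not reflect the OrthoMCL output format.

```python
def lethal_orthologs(clusters, lethal_genes,
                     query_species="cel", target_species="min"):
    """Find target-species genes clustered with lethal query-species genes.

    clusters:     dict mapping cluster id -> list of (species, gene_id).
    lethal_genes: set of query-species gene ids with a lethal phenotype.
    Returns a dict mapping cluster id -> sorted target-species gene ids,
    restricted to clusters that contain at least one lethal query gene
    and at least one target-species gene.
    """
    result = {}
    for cluster_id, members in clusters.items():
        has_lethal = any(sp == query_species and gene in lethal_genes
                         for sp, gene in members)
        targets = sorted(gene for sp, gene in members
                         if sp == target_species)
        if has_lethal and targets:
            result[cluster_id] = targets
    return result


# Hypothetical clusters and gene identifiers, for illustration only.
toy_clusters = {
    "grp1": [("cel", "geneA"), ("min", "Minc_p1"), ("min", "Minc_p2")],
    "grp2": [("cel", "geneB"), ("dme", "dm1")],
    "grp3": [("cel", "geneC"), ("min", "Minc_p3")],
}
toy_lethal = {"geneA", "geneB"}

hits = lethal_orthologs(toy_clusters, toy_lethal)
print(hits)
```

Here only grp1 is reported: grp2 has a lethal C. elegans gene but no M. incognita member, and grp3 has an M. incognita member but no lethal C. elegans gene.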

9 REFERENCES

1. Blok, V.C., Jones, J.T., Phillips, M.S. & Trudgill, D.L. Parasitism genes and host range

disparities in biotrophic nematodes: the conundrum of polyphagy versus specialisation.

Bioessays 30, 249-259 (2008).

2. Trudgill, D.L. Parthenogenetic root-knot nematodes (Meloidogyne spp.); how can these

biotrophic endoparasites have such an enormous host range? Plant Pathology 46, 26-32

(1997).

3. Castagnone-Sereno, P. et al. Phylogenetic relationships between amphimictic and

parthenogenetic nematodes of the genus Meloidogyne as inferred from repetitive DNA

analysis. Heredity 70, 195-204 (1993).

4. Castagnone-Sereno, P., Vanlerberghe-Masutti, F. & Leroy, F. Genetic polymorphism between and within

Meloidogyne species detected with RAPD markers. Genome 37, 904-909 (1994).

5. Scholl, E.H. & Bird, D.M. Resolving tylenchid evolutionary relationships through multiple

gene analysis derived from EST data. Mol Phylogenet Evol 36, 536-545 (2005).

6. Esbenshade, P.R. & Triantaphyllou, A.C. Enzymatic relationships and evolution in the genus

Meloidogyne (Nematoda: Tylenchida). Journal of Nematology 19, 8-18 (1987).

7. Hugall, A., Stanton, J. & Moritz, C. Evolution of the AT-rich mitochondrial DNA of the root

knot nematode, Meloidogyne hapla. Mol Biol Evol 14, 40-48 (1997).

8. Blaxter, M.L. et al. A molecular evolutionary framework for the phylum Nematoda. Nature

392, 71-75 (1998).

9. Triantaphyllou, A.C. in An advanced treatise on Meloidogyne, Vol. 1. (eds. J.N. Sasser & C.C.

Carter) 113-126 (North Carolina State University Graphics, Raleigh, USA; 1985).

10. Castagnone-Sereno, P. Genetic variability of nematodes: a threat to the durability of plant

resistance genes? Euphytica 124, 193-199 (2002).

11. Lushai, G., Loxdale, H.D. & Allen, J.A. The dynamic clonal genome and its adaptive

potential. Biological Journal of the Linnean Society 79, 193-208 (2003).

12. Leroy, S., Duperray, C. & Morand, S. Flow cytometry for parasite nematode genome size

measurement. Mol Biochem Parasitol 128, 91-93 (2003).

13. Mark Welch, D.B., Cummings, M.P., Hillis, D.M. & Meselson, M. Divergent gene copies in

the asexual class Bdelloidea (Rotifera) separated before the bdelloid radiation or within

bdelloid families. Proc Natl Acad Sci U S A 101, 1622-1625 (2004).

14. Mark Welch, D.B. & Meselson, M.S. Rates of nucleotide substitution in sexual and anciently

asexual rotifers. Proc Natl Acad Sci U S A 98, 6720-6724 (2001).

15. Lowe, T.M. A Genomic tRNA Database, http://lowelab.ucsc.edu/GtRNAdb. (2005).

16. Mitreva, M. et al. Codon usage patterns in Nematoda: analysis based on over 25 million

codons in thirty-two species. Genome Biol 7, R75 (2006).

17. Griffiths-Jones, S., Saini, H.K., van Dongen, S. & Enright, A.J. miRBase: tools for microRNA

genomics. Nucleic Acids Res 36, D154-158 (2008).

18. Hastings, K.E. SL trans-splicing: easy come or easy go? Trends Genet 21, 240-247 (2005).

19. Guiliano, D.B. & Blaxter, M.L. Operon conservation and the evolution of trans-splicing in the

phylum Nematoda. PLoS Genet 2, e198 (2006).

20. Krause, M. & Hirsh, D. A trans-spliced leader sequence on actin mRNA in C. elegans. Cell

49, 753-761 (1987).

21. Greenbaum, N.L., Radhakrishnan, I., Hirsh, D. & Patel, D.J. Determination of the folding

topology of the SL1 RNA from Caenorhabditis elegans by multidimensional heteronuclear

NMR. J Mol Biol 252, 314-327 (1995).

22. Blumenthal, T. & Gleason, K.S. Caenorhabditis elegans operons: form and function. Nat Rev

Genet 4, 112-120 (2003).

23. Spieth, J., Brooke, G., Kuersten, S., Lea, K. & Blumenthal, T. Operons in C. elegans:

Polycistronic mRNA precursors are processed by trans-splicing of SL2 to downstream coding

regions. Cell 73, 521-532 (1993).

24. Blumenthal, T. et al. A global analysis of Caenorhabditis elegans operons. Nature 417, 851-

854 (2002).

25. Koltai, H., Spiegel, Y. & Blaxter, M.L. Regulated use of an alternative spliced leader exon in

the plant parasitic nematode Meloidogyne javanica. Mol Biochem Parasitol 86, 107-110

(1997).

26. Stein, L.D. et al. The genome sequence of Caenorhabditis briggsae: a platform for

comparative genomics. PLoS Biol 1, E45 (2003).

27. The C. elegans Genome Sequencing Consortium. Genome sequence of the nematode C.

elegans: a platform for investigating biology. Science 282, 2012-2018 (1998).

28. Ghedin, E. et al. Draft genome of the filarial nematode parasite Brugia malayi. Science 317,

1756-1760 (2007).

29. Page, A.P. Cyclophilin and protein disulfide isomerase genes are co-transcribed in a

functionally related manner in Caenorhabditis elegans. DNA Cell Biol 16, 1335-1343 (1997).

30. Spieth, J. & Lawson, D. Overview of gene structure. WormBook

doi/10.1895/wormbook.1.65.1, http://www.wormbook.org (2006).

31. Hillier, L.W. et al. Genomics in C. elegans: so many genes, such a little worm. Genome Res

15, 1651-1660 (2005).

32. Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein

or nucleotide sequences. Bioinformatics 22, 1658-1659 (2006).

33. Mulder, N.J. et al. New developments in the InterPro database. Nucleic Acids Res 35, D224-

228 (2007).

34. Quevillon, E. et al. InterProScan: protein domains identifier. Nucleic Acids Res 33, W116-120

(2005).

35. Lewis, S.E. et al. Apollo: a sequence annotation editor. Genome Biol 3, RESEARCH0082

(2002).

36. Haslam, S.M. & Dell, A. Hallmarks of Caenorhabditis elegans N-glycosylation: complexity

and controversy. Biochimie 85, 25-32 (2003).

37. Paschinger, K., Gutternigg, M., Rendic, D. & Wilson, I.B. The N-glycosylation pattern of

Caenorhabditis elegans. Carbohydr Res (2007).

38. Lopez-Otin, C. & Overall, C.M. Protease degradomics: a new challenge for proteomics. Nat

Rev Mol Cell Biol 3, 509-519 (2002).

39. Wu, Y., Wang, X., Liu, X. & Wang, Y. Data-mining approaches reveal hidden families of

proteases in the genome of malaria parasite. Genome Res 13, 601-616 (2003).

40. Puente, X.S. & Lopez-Otin, C. A genomic analysis of rat proteases and protease inhibitors.

Genome Res 14, 609-622 (2004).

41. Frand, A.R., Russel, S. & Ruvkun, G. Functional genomic analysis of C. elegans molting.

PLoS Biol 3, e312 (2005).

42. Williamson, A.L., Brindley, P.J., Knox, D.P., Hotez, P.J. & Loukas, A. Digestive proteases of

blood-feeding nematodes. Trends Parasitol 19, 417-423 (2003).

43. Davis, E.L., Hussey, R.S. & Baum, T.J. Getting to the roots of parasitism by nematodes.

Trends Parasitol 20, 134-141 (2004).

44. Jones, J.T., Smant, G. & Blok, V.C. SXP/RAL-2 proteins of the potato cyst nematode

Globodera rostochiensis: secreted proteins of the hypodermis and amphids. Nematology 2,

887-893 (2000).

45. Rao, K.V. et al. The Wuchereria bancrofti orthologue of Brugia malayi SXP1 and the

diagnosis of bancroftian filariasis. Mol Biochem Parasitol 107, 71-80 (2000).

46. Qin, L. et al. Plant degradation: a nematode expansin acting on plants. Nature 427, 30 (2004).

47. Gao, B. et al. Molecular characterisation and expression of two venom allergen-like protein

genes in Heterodera glycines. Int J Parasitol 31, 1617-1625 (2001).

48. Ding, X., Shields, J., Allen, R. & Hussey, R.S. Molecular cloning and characterisation of a

venom allergen AG5-like cDNA from Meloidogyne incognita. Int J Parasitol 30, 77-81

(2000).

49. Prior, A. et al. A surface-associated retinol- and fatty acid-binding protein (Gp-FAR-1) from

the potato cyst nematode Globodera pallida: lipid binding activities, structural analysis and

expression pattern. Biochem J 356, 387-394 (2001).

50. Garofalo, A. et al. The FAR proteins of filarial nematodes: secretion, glycosylation and lipid

binding characteristics. Mol Biochem Parasitol 122, 161-170 (2002).

51. Jones, J.T., Reavy, B., Smant, G. & Prior, A.E. Glutathione peroxidases of the potato cyst

nematode Globodera rostochiensis. Gene 324, 47-54 (2004).

52. Jaubert, S. et al. Comparative analysis of two 14-3-3 homologues and their expression pattern

in the root-knot nematode Meloidogyne incognita. Int J Parasitol 34, 873-880 (2004).

53. Jaubert, S. et al. In planta secretion of a calreticulin by migratory and sedentary stages of root-

knot nematode. Mol Plant Microbe Interact 18, 1277-1284 (2005).

54. Gao, B. et al. Characterisation and developmental expression of a chitinase gene in Heterodera

glycines. Int J Parasitol 32, 1293-1300 (2002).

55. Shingles, J., Lilley, C.J., Atkinson, H.J. & Urwin, P.E. Meloidogyne incognita: molecular and

biochemical characterisation of a cathepsin L cysteine proteinase and the effect on parasitism

following RNAi. Exp Parasitol 115, 114-120 (2007).

56. Vanholme, B. et al. Secretions of plant-parasitic nematodes: a molecular update. Gene 332,

13-27 (2004).

57. Blanchard, A., Esquibet, M., Fouville, D. & Grenier, E. Ranbpm homologue genes

characterised in the cyst nematodes Globodera pallida and Globodera 'mexicana'.

Physiological and Molecular Plant Pathology 67, 15-22 (2005).

58. Tytgat, T. et al. A new class of ubiquitin extension proteins secreted by the dorsal pharyngeal

gland in plant parasitic cyst nematodes. Mol Plant Microbe Interact 17, 846-852 (2004).

59. Bekal, S., Niblack, T.L. & Lambert, K.N. A chorismate mutase from the soybean cyst

nematode Heterodera glycines shows polymorphisms that correlate with virulence. Mol Plant

Microbe Interact 16, 439-446 (2003).

60. Lambert, K.N., Allen, K.D. & Sussex, I.M. Cloning and characterization of an esophageal-

gland-specific chorismate mutase from the phytoparasitic nematode Meloidogyne javanica.

Mol Plant Microbe Interact 12, 328-336 (1999).

61. Jones, J.T. et al. Characterization of a chorismate mutase from the potato cyst nematode

Globodera pallida. Molecular Plant Pathology 4, 43-50 (2003).

62. Wang, X.H. et al. A parasitism gene from a plant-parasitic nematode with function similar to

CLAVATA3/ESR (CLE) of Arabidopsis thaliana. Molecular Plant Pathology 6, 187-191

(2005).

63. Scholl, E.H., Thorne, J.L., McCarter, J.P. & Bird, D.M. Horizontally transferred genes in

plant-parasitic nematodes: a high-throughput genomic approach. Genome Biol 4, R39 (2003).

64. Huang, G. et al. A profile of putative parasitism genes expressed in the esophageal gland cells

of the root-knot nematode Meloidogyne incognita. Mol Plant Microbe Interact 16, 376-381

(2003).

65. Yan, Y., Smant, G. & Davis, E. Functional screening yields a new beta-1,4-endoglucanase

gene from Heterodera glycines that may be the product of recent gene duplication. Mol Plant

Microbe Interact 14, 63-71 (2001).

66. Huang, G. et al. A root-knot nematode secretory peptide functions as a ligand for a plant

transcription factor. Mol Plant Microbe Interact 19, 463-470 (2006).

67. Huang, G.Z., Allen, R., Davis, E.L., Baum, T.J. & Hussey, R.S. Engineering broad root-knot

resistance in transgenic plants by RNAi silencing of a conserved and essential root-knot

nematode parasitism gene. Proceedings of the National Academy of Sciences of the United

States of America 103, 14302-14306 (2006).

68. Salinas, A.E. & Wong, M.G. Glutathione S-transferases--a review. Curr Med Chem 6, 279-

309 (1999).

69. Lindblom, T.H. & Dodd, A.K. Xenobiotic detoxification in the nematode Caenorhabditis

elegans. J Exp Zoolog A Comp Exp Biol 305, 720-730 (2006).

70. Menzel, R., Bogaert, T. & Achazi, R. A systematic gene expression screen of Caenorhabditis

elegans cytochrome P450 genes reveals CYP35 as strongly xenobiotic inducible. Arch

Biochem Biophys 395, 158-168 (2001).

71. Menzel, R., Rodel, M., Kulas, J. & Steinberg, C.E. CYP35: xenobiotically induced gene

expression in the nematode Caenorhabditis elegans. Arch Biochem Biophys 438, 93-102

(2005).

72. Reichert, K. & Menzel, R. Expression profiling of five different xenobiotics using a

Caenorhabditis elegans whole genome microarray. Chemosphere 61, 229-237 (2005).

73. Menzel, R. et al. Cytochrome P450s and short-chain dehydrogenases mediate the

toxicogenomic response of PCB52 in the nematode Caenorhabditis elegans. J Mol Biol 370, 1-

13 (2007).

74. Fanelli, E., Dileo, C., Di Vito, M. & De Giorgi, C. Inducible antibacterial defence in the plant

parasitic nematode Meloidogyne artiellia. Int J Parasitol (2007).

75. Cortese, M.R., Di Vito, M. & De Giorgi, C. The expression of the homologue of the

Caenorhabditis elegans lin-45 raf is regulated in the motile stages of the plant parasitic

nematode Meloidogyne artiellia. Mol Biochem Parasitol 149, 38-47 (2006).

76. Schulenburg, H., Kurz, C.L. & Ewbank, J.J. Evolution of the innate immune system: the

worm perspective. Immunol Rev 198, 36-58 (2004).

77. Ewbank, J.J. Signaling in the immune response. WormBook doi/10.1895/wormbook.1.83.1,

http://www.wormbook.org (2006).

78. Pujol, N. et al. A reverse genetic analysis of components of the Toll signaling pathway in

Caenorhabditis elegans. Curr Biol 11, 809-821 (2001).

79. Alegado, R.A. & Tan, M.W. Resistance to antimicrobial peptides contributes to persistence of

Salmonella typhimurium in the C. elegans intestine. Cell Microbiol (2008).

80. Mallo, G.V. et al. Inducible antibacterial defense system in C. elegans. Curr Biol 12, 1209-

1214 (2002).

81. Couillault, C. et al. TLR-independent control of innate immunity in Caenorhabditis elegans by

the TIR domain adaptor protein TIR-1, an ortholog of human SARM. Nat Immunol 5, 488-494

(2004).

82. Pujol, N. et al. Anti-fungal innate immunity in C. elegans is enhanced by evolutionary

diversification of antimicrobial peptides. PLoS Pathogens (Submitted).

83. Plowman, G.D., Sudarsanam, S., Bingham, J., Whyte, D. & Hunter, T. The protein kinases of

Caenorhabditis elegans: a model for signal transduction in multicellular organisms. Proc Natl

Acad Sci U S A 96, 13603-13610 (1999).

84. Manning, G., Plowman, G.D., Hunter, T. & Sudarsanam, S. Evolution of protein kinase

signaling from yeast to man. Trends Biochem Sci 27, 514-520 (2002).

85. Robertson, H.M. & Thomas, J.H. The putative chemoreceptor families of C. elegans.

WormBook doi/10.1895/wormbook.1.66.1, http://www.wormbook.org (2006).

86. Bargmann, C.I. Chemosensation in C. elegans. WormBook doi/10.1895/wormbook.1.123.1,

http://www.wormbook.org (2006).

87. Sambrook, J., Fritsch, E.F. & Maniatis, T. Molecular Cloning: a laboratory manual. 2nd ed.

(Cold Spring Harbor Laboratory Press, N.Y., USA; 1989).

88. McCarter, J.P. et al. Analysis and functional classification of transcripts from the nematode

Meloidogyne incognita. Genome Biol 4, R26 (2003).

89. Dautova, M. et al. Single pass cDNA sequencing - a powerful tool to analyse gene expression

in preparasitic juveniles of the southern root-knot nematode Meloidogyne incognita.

Nematology 3, 129-139 (2001).

90. Pertea, G. et al. TIGR Gene Indices clustering tools (TGICL): a software system for fast

clustering of large EST datasets. Bioinformatics 19, 651-652 (2003).

91. Muller, D. et al. A tale of two oxidation states: bacterial colonization of arsenic-rich

environments. PLoS Genet 3, e53 (2007).

92. Kent, W.J. BLAT--the BLAST-like alignment tool. Genome Res 12, 656-664 (2002).

93. Altschul, S.F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database

search programs. Nucleic Acids Res 25, 3389-3402 (1997).

94. Quesneville, H. et al. Combined evidence annotation of transposable elements in genome

sequences. PLoS Comput Biol 1, 166-175 (2005).

95. Bao, Z. & Eddy, S.R. Automated de novo identification of repeat sequence families in

sequenced genomes. Genome Res 12, 1269-1276 (2002).

96. Edgar, R.C. & Myers, E.W. PILER: identification and classification of genomic repeats.

Bioinformatics 21 Suppl 1, i152-158 (2005).

97. Huang, X. On global sequence alignment. Comput Appl Biosci 10, 227-235 (1994).

98. Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet

Genome Res 110, 462-467 (2005).

99. Jurka, J., Klonowski, P., Dagman, V. & Pelton, P. Censor - A program for identification and

elimination of repetitive elements from DNA sequences. Computers & Chemistry 20, 119-121

(1996).

100. Smit, A.F.A., Hubley, R. & Green, P. RepeatMasker, http://repeatmasker.org. (1996-2004).

101. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res

27, 573-580 (1999).

102. Kolpakov, R., Bana, G. & Kucherov, G. mreps: Efficient and flexible detection of tandem

repeats in DNA. Nucleic Acids Res 31, 3672-3678 (2003).

103. Noirot, C., Gaspin, C., Schiex, T. & Gouzy, J. LeARN: a platform for detecting, clustering

and annotating non-coding RNAs. BMC Bioinformatics 9, 21 (2008).

104. Lowe, T.M. & Eddy, S.R. tRNAscan-SE: a program for improved detection of transfer RNA

genes in genomic sequence. Nucleic Acids Res 25, 955-964 (1997).

105. Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic

Acids Res 33, D121-124 (2005).

106. Billoud, B., De Paepe, R., Baulcombe, D. & Boccara, M. Identification of new small non-

coding RNAs from tobacco and Arabidopsis. Biochimie 87, 905-910 (2005).

107. Corpet, F. Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16,

10881-10890 (1988).

108. Thompson, J.D., Higgins, D.G. & Gibson, T.J. CLUSTAL W: improving the sensitivity of

progressive multiple sequence alignment through sequence weighting, position-specific gap

penalties and weight matrix choice. Nucleic Acids Res 22, 4673-4680 (1994).

109. Steffen, P., Voss, B., Rehmsmeier, M., Reeder, J. & Giegerich, R. RNAshapes: an integrated

RNA analysis package based on abstract shapes. Bioinformatics 22, 500-503 (2006).

110. Harris, T.W. et al. WormBase: a multi-species resource for nematode biology and genomics.

Nucleic Acids Res 32, D411-417 (2004).

111. Li, L., Stoeckert, C.J., Jr. & Roos, D.S. OrthoMCL: identification of ortholog groups for

eukaryotic genomes. Genome Res 13, 2178-2189 (2003).

112. Foissac, S. & Schiex, T. Integrating alternative splicing detection into gene prediction. BMC

Bioinformatics 6, 25 (2005).

113. Bairoch, A. et al. The Universal Protein Resource (UniProt). Nucleic Acids Res 33, D154-159

(2005).

114. Rogers, A. et al. WormBase 2007. Nucleic Acids Res 36, D612-617 (2008).

115. Degroeve, S., Saeys, Y., De Baets, B., Rouze, P. & Van de Peer, Y. SpliceMachine: predicting

splice sites from high-dimensional local context representations. Bioinformatics 21, 1332-

1338 (2005).

116. Gremme, G., Brendel, V., Sparks, M.E. & Kurtz, S. Engineering a software tool for gene

structure prediction in higher organisms. Information and Software Technology 47, 965-978

(2005).

117. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology

Consortium. Nat Genet 25, 25-29 (2000).

118. Emanuelsson, O., Brunak, S., von Heijne, G. & Nielsen, H. Locating proteins in the cell using

TargetP, SignalP and related tools. Nat Protoc 2, 953-971 (2007).

119. Bendtsen, J.D., Nielsen, H., von Heijne, G. & Brunak, S. Improved prediction of signal

peptides: SignalP 3.0. J Mol Biol 340, 783-795 (2004).

120. Pel, H.J. et al. Genome sequencing and analysis of the versatile cell factory Aspergillus niger

CBS 513.88. Nat Biotechnol 25, 221-231 (2007).

121. Tuskan, G.A. et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray).

Science 313, 1596-1604 (2006).

122. Martin, F. et al. The genome of Laccaria bicolor provides insights into mycorrhizal symbiosis.

Nature 452, 88-92 (2008).

123. Dean, R.A. et al. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature

434, 980-986 (2005).

124. Cuomo, C.A. et al. The Fusarium graminearum genome reveals a link between localized

polymorphism and pathogen specialization. Science 317, 1400-1402 (2007).

125. Galagan, J.E. et al. The genome sequence of the filamentous fungus Neurospora crassa.

Nature 422, 859-868 (2003).

126. Rawlings, N.D., Morton, F.R., Kok, C.Y., Kong, J. & Barrett, A.J. MEROPS: the peptidase

database. Nucleic Acids Res 36, D320-325 (2008).

127. Edgar, R.C. MUSCLE: a multiple sequence alignment method with reduced time and space

complexity. BMC Bioinformatics 5, 113 (2004).

128. Edgar, R.C. MUSCLE: multiple sequence alignment with high accuracy and high throughput.

Nucleic Acids Res 32, 1792-1797 (2004).

129. Guindon, S. & Gascuel, O. A simple, fast, and accurate algorithm to estimate large

phylogenies by maximum likelihood. Syst Biol 52, 696-704 (2003).

130. Jones, D.T., Taylor, W.R. & Thornton, J.M. The rapid generation of mutation data matrices

from protein sequences. Comput Appl Biosci 8, 275-282 (1992).

131. McVeigh, P. et al. Analysis of FMRFamide-like peptide (FLP) diversity in phylum Nematoda.

Int J Parasitol 35, 1043-1060 (2005).

132. Husson, S.J., Mertens, I., Janssen, T., Lindemans, M. & Schoofs, L. Neuropeptidergic

signaling in the nematode Caenorhabditis elegans. Prog Neurobiol 82, 33-55 (2007).

133. Li, C. The ever-expanding neuropeptide gene families in the nematode Caenorhabditis

elegans. Parasitology 131 Suppl, S109-127 (2005).

134. Husson, S.J., Clynen, E., Baggerman, G., De Loof, A. & Schoofs, L. Discovering

neuropeptides in Caenorhabditis elegans by two dimensional liquid chromatography and mass

spectrometry. Biochem Biophys Res Commun 335, 76-86 (2005).