
Molecular Phylogenetics and Evolution 32 (2004) 601–615

www.elsevier.com/locate/ympev

The phylogeny of acorn weevils (genus Curculio) from mitochondrial and nuclear DNA sequences: the problem of incomplete data

Joseph Hughes* and Alfried P. Vogler

Department of Biological Sciences, Imperial College London, Silwood Park Campus, Ascot SL5 7PY, UK

Department of Entomology, The Natural History Museum, Cromwell Road, London SW7 5BD, UK

Received 26 June 2003; revised 4 February 2004

Available online 1 April 2004

Abstract

We considered the contribution of two mitochondrial and two nuclear data sets for the phylogenetic reconstruction of 22 species of seed beetles in the genus Curculio (Coleoptera: Curculionidae). A phylogenetic tree from representatives found on various hosts was inferred from a combined data set of mitochondrial DNA cytochrome oxidase subunit I, mitochondrial cytochrome b, nuclear elongation factor 1α, and nuclear phosphoglycerate mutase, used for the first time as a molecular marker. Separate parsimony analyses of each data set showed that individual gene trees were mainly congruent and often complementary in the support of clades, but the analysis was complicated by failure of PCR amplification of nuclear genes for many taxa and hence missing data entries. When the four gene partitions were combined in a simultaneous analysis despite the missing data, this increased the resolution and taxonomic coverage compared to the individual source trees. Alternative approaches of combining the information via supertree methodology produced a comparatively less resolved tree, and hence seem inferior to combining data matrices even in cases where numerous taxa are missing. The molecular data suggest a classification of the European species into two species groups that are in accordance with morphological characteristics, but the data do not support any of the previously recognised American species groups.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Missing data; Pglym; COI; Cytb; EF-1α; Coleoptera; Curculio; MRP

* Corresponding author. Fax: +02075942339.

E-mail address: [email protected] (J. Hughes).

1055-7903/$ - see front matter © 2004 Elsevier Inc. All rights reserved.

doi:10.1016/j.ympev.2004.02.007

1. Introduction

The genus Curculio represents a large group of seed beetles (Coleoptera: Curculionidae: Curculioninae: Curculionini) composed of approximately 345 species distributed across Asia, Europe, Africa, and North America (Von Dalla Torre and Schlenkling, 1932). Very little work has been conducted on the classification of this genus, although it has been suggested that Curculio can be subdivided into several subgenera (Alonso-Zarazaga et al., 1999). Of the Curculio species in the world, 42 species are known from China and have been classified into seven species groups (Pelsue, 2000; Pelsue and Zhang, 2000). In a treatise on American Curculio, Chittenden (1926) suggested seven species groups to subdivide the 27 North American Curculio based mainly on characters related to length, arcuation, and other structural characters of the rostrum. On the other hand, the 12 European species have not been classified into species groups except for the occasional grouping of four inquiline weevils into the subgenus Balanobius (Hoffmann, 1954). Similarly, Gibson (1969) chose not to follow Chittenden's (1926) groupings for the North American species.

The Curculio weevils have a characteristically long rostrum and extendable ovipositors, which enable them to lay their eggs in seeds from various families such as Fagaceae, Betulaceae, and Juglandaceae, and some species can be found in various galls on oaks and willow. The rostrum is believed to be the key adaptation that has enabled the genus to diversify and adapt to varying seed sizes, and is believed to have been instrumental in the increase of species diversity in the whole Curculionidae (Anderson, 1995). The large morphological diversity in the genus, in particular in rostrum length, might provide an unambiguous example of adaptive radiation. Despite their astonishing morphological characteristics, only a few species have been studied in greater detail. These include the American pecan weevil, Curculio caryae, the chestnut weevil, Curculio elephas, and the hazelnut weevil, Curculio nucum, that are considered crop pests in certain regions (Collins et al., 1997, 1996; Coutin, 1960; Debouzie and Pallen, 1987, 1996; Desouhant, 1998; Eikenbary and Raney, 1973; Harris, 1976; Martin, 1949; Menu and Debouzie, 1995; Schraer et al., 1998). In this study, we begin the process of constructing a phylogeny of the genus Curculio by considering 22 species gathered mainly from North America and Europe for which no phylogenetic information exists. Specifically, we were interested in determining the relationship of species so that they could be used for comparative studies of ecological and morphological characters, in particular for presumed adaptive morphological characters such as the rostrum length.

For this purpose, it is important to obtain a robust phylogeny with the widest species coverage, the greatest accuracy and resolution. It has been broadly suggested (e.g., Wheeler et al., 1993) that phylogeny is best assessed by using data sets from distinct sources, i.e., it is believed that 'true' genealogies will most accurately be recovered if data are collected from unlinked character sets. Thus, in this study we have used four genes to explore the relationships in the genus Curculio. However, it is rarely possible to obtain a complete data set for all the taxa. Missing data is recognised as a nuisance factor in all statistical analyses including phylogenetic reconstruction (e.g., Platnick et al., 1991; Wiens and Reeder, 1995; Wilkinson, 1995). Studies that attempt to deal with the problem of missing data have concentrated their efforts on the effect of incomplete taxa sampling (e.g., Wiens and Reeder, 1995; Wilkinson, 1995). However, incomplete sets of characters may be equally common; for example, it is often difficult to obtain sequence data for all genes chosen. Reducing the analysis to include only the taxa scored for all characters is often undesirable because it limits the taxonomic scope of the study and may be impossible if different sets of characters are missing in different sets of taxa.

The justification for the inclusion or exclusion of these characters has not been thoroughly addressed. Previous studies have suggested that increasing the number of characters should generally increase phylogenetic accuracy (e.g., Hillis et al., 1994) but increasing the amount of missing data may decrease accuracy and increase the number of most-parsimonious trees (e.g., Huelsenbeck, 1991; Wiens and Reeder, 1995). Wiens (1998) showed through simulations that the addition of a set of characters with missing data is generally more likely to increase phylogenetic accuracy. Although missing data are not usually presented as an issue in molecular phylogenetic analysis, missing characters do frequently occur in such data sets. The methods used for reconstructing phylogenies from these types of incomplete data sets are of particular importance to the debate of supertree versus supermatrix approaches for reconstructing very large phylogenies from diverse data sets. Indeed, when the 'Tree of Life' will eventually be assembled, the characters used for its reconstruction will inevitably be different depending on the taxonomic group.

This study provides an empirical example of the effect of increasing the number of characters and the amount of missing data on accuracy and resolution of a phylogenetic analysis. This first phylogenetic tree of the genus Curculio will be discussed with respect to limitations of incomplete species sampling and tree building methodologies with incomplete data.

2. Materials and methods

2.1. Insects sampling and DNA extraction

The collection localities and collectors are presented in Appendix A. Multiple samples were analysed for most species. Identification of species was based on external morphological characteristics using identification keys in Gibson (1969) and Hoffmann (1954). Species in the genus Curculio are at least partly defined on characteristics of the rostrum and the femoral tooth size. Two species were unidentified due to the apparent lack of identification keys and voucher specimens in the Natural History Museum. Adult weevils were preserved in absolute ethanol and were used for phenol/chloroform extraction of total DNA (Vogler et al., 1993). Specimens are kept in ethanol in the collection of the Natural History Museum, London.

2.2. Molecular markers and PCR amplification

The use of mitochondrial DNA (mtDNA) sequence data has become the standard for many phylogenetic studies (Caterino et al., 2000); examples are numerous in the Coleoptera (e.g., Cognato and Sperling, 2000; Maus et al., 2001). COI has been used to resolve phylogenetic relationships at many taxonomic levels; therefore, we believe the use of this gene is appropriate. Another frequently used mitochondrial locus in insect phylogenetics is cytochrome b (Cytb), which has successfully been used to recover inter- and intrageneric phylogenies (Jermiin and Crozier, 1994; Stone and Cook, 1998). We used primers Pat and Jerry to amplify a region of 829 bp of COI (Loxdale and Lushai, 1998) and, for Cytb, universal primers CB1 and CB2 amplifying 487 bp were used (Simon et al., 1994).

In addition, partial sequences from the nuclear gene encoding elongation factor 1α (EF-1α) were used. This is a single-copy gene that shows considerable conservatism in amino acid substitution rates. EF-1α has also been very informative in a wide range of phylogenetic analyses of other arthropods and insects (e.g., Cho et al., 1995; Cryan et al., 2000) and, used in combination with other nuclear and mitochondrial genes, has provided crucial data for obtaining well resolved topologies (Cognato and Vogler, 2001; Sequeira and Farrell, 2001; Sequeira et al., 2000). For the amplification of the EF-1α fragment, we first tested the primers Adam and Grizzly (see Table 1) and then designed some primers based on the alignment of these sequences and the sequences of Taphrorychus bicolor (Coleoptera: Scolytidae) from GenBank (Accession No. AF186663), resulting in the amplification of a 613 bp long fragment.

Phosphoglycerate mutase (Pglym) was chosen as a second nuclear marker because enzyme encoding genes have successfully been used for phylogeny reconstruction for a number of insect taxa, for example: enolase for Curculionidae (Farrell et al., 2001), glucose-6-phosphate dehydrogenase for Culicidae (Krzywinski et al., 2001) and alcohol dehydrogenase for Drosophilidae (Remsen and O'Grady, 2002). The isolation and characterisation of Pglym as a molecular marker in the genus Curculio was achieved through a cDNA library of Curculio glandium (Accession No. BQ476301; Theodorides et al., 2002). Phosphoglycerate mutase provided a high match to Drosophila melanogaster Pglym78 (75%, E = 10e−111, Accession No. L27654, Currie and Sullivan, 1994), making the design of primers more reliable and more likely to work across a broad range of insect species; the cDNA sequence of C. glandium was long (1013 nucleotides) with only two introns, providing a large region for primer design; finally, Pglym from insects has not been well characterised and to our knowledge there has only been one report concerning this enzyme in insects (Currie and Sullivan, 1994), thus providing the opportunity to study the molecular evolution of this gene.

Sequences for all primers used in this study are given in Table 1. These primers were used for direct sequencing from total genomic DNA. Each PCR cocktail contained 2.5 µl of 10× PCR buffer, 0.5–2 µl of 50 mM MgCl2, 0.5 µl of 10 mM dNTPs, 0.5 µl of each primer (10 µM), 0.1–0.2 µl of Taq, 1 µl of DNA template and was made up to a total volume of 25 µl with ddH2O. The PCR was performed in a Biometra UNOII Thermocycler: 3 min at 94 °C; 35 cycles of 30 s at 94 °C, 30 s at X °C (refer to Table 1 for primer specific annealing temperatures), 1 min at 72 °C; finally, 5 min at 72 °C. Annealing temperature and number of cycles were usually varied independently to optimise the conditions for each primer pair. Cycle sequence reactions were run on an ABI 3700 sequencer using BigDye (Perkin–Elmer) technology. Sense and anti-sense strands were edited using Sequencher 4.1 (Gene Codes, Ann Arbor, MI, USA).

Table 1
Primers used in the present study and primer specific annealing temperatures

| Gene | Primer name | Primer sequence (5′–3′) | Temperature (°C) |
|---|---|---|---|
| Mitochondrial | | | |
| COI | C1-J-2183 (Jerry) | CAA CAT TTA TTT TGA TTT TTT GG | 50 |
| | L2-N-3014 (Pat) | TCC AAT GCA CTA ATC TGC CAT ATT A | |
| Cytb | CB-J-10933 (CB1) | TAT GTA CTA CCA TGA GGA CAA ATA TC | 45 |
| | CB-N-11367 (CB2) | ATT ACA CCT CCT AAT TTA TTA GGA AT | |
| Nuclear | | | |
| EF-1α | Grizzly | GGA GTG AAA CAA CTT ATC GTT GG | 45.5–48 |
| | Adams | ACA TTG TCA CCA GGG ACA GC | |
| | EF1F | TGG TGA ATT TGA GGC TGG TAT CT | 47–48 |
| | EF1R | TAG GTG GGT TGT TCT TGG AGT CA | |
| Pglym | PGlymFc | GAT GCC (AT)A(CT) TT(AG) AGC GAA A | 50 |
| | PglymRc | TAC CAT GGG CAG CAA TAA GAA | |

2.3. Phylogenetic construction

There was no length variation in COI, and a loss of six base pairs in Cytb in C. caryae. The alignment of EF-1α and Pglym was achieved with CLUSTAL X (Aladdin Systems, Heidelberg, Germany) using default gap costs, and alignments were then subjected to eye inspection. The EF-1α sequences varied in length in the intron region from 42 to 129 bp. Multiple copies of EF-1α have been described from Drosophila flies (Hovemann et al., 1988) and Apis bees (Danforth and Ji, 1998), and this is probably a universal feature throughout the insect orders (Jordal, 2002). Because 'cryptic' paralogous sequences would interfere with phylogeny reconstruction, attention was paid to include only orthologous sequences with shared intron structure in the phylogenetic analysis. Intron 1 of Pglym varied in length from 15 to 116 bp; there was no length variation within exon 2 of Pglym. The direct sequencing did not suggest that any of the individuals sequenced were heterozygous for either EF-1α or Pglym, so we did not believe it necessary to clone the PCR product. Sequences used in the present analysis are accessible in GenBank (AY327631–AY327759).

Standard phylogenetic analyses were performed in PAUP 4.0b4a (Swofford, 1999). The ts/tv ratio was estimated by maximum likelihood on a neighbour-joining tree considering all codon positions. Genetic distances for each data set were estimated using the Tamura–Nei distance (Tamura and Nei, 1993). Curculio pyrrhoceras was chosen as an outgroup: it presented the greatest divergence from the other Curculio species for all four genes, and it exhibits a very different life-history strategy from most of the other Curculio species analysed. All substitutions were weighted equally and gaps treated as missing data. The maximum number of trees that could be saved during the heuristic search was set to 10,000 (most of the searches never came close to that limit). Most-parsimonious (MP) trees were determined by heuristic search of 100 random addition replicates with tree bisection-reconnection (TBR) branch-swapping with the steepest descent option in effect. Confidence in each node was assessed by bootstrapping with 1000 replicates, each a heuristic search of 20 random addition replicates with TBR branch swapping and the steepest descent option activated, to limit the analysis time. Bremer support (BS) (Bremer, 1988) was calculated on the shortest trees using constraint files generated in TreeRot (Sorenson, 1999).
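The resampling step behind the bootstrap can be sketched as follows. This is a minimal illustration only: the parsimony search itself was run in PAUP and is not reproduced here, and the names `bootstrap_alignment` and `clade_support`, as well as the dictionary representation of an alignment, are assumptions made for this example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: alignment columns resampled with replacement.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, with replacement.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}


def clade_support(replicate_clades, clade):
    """Percentage of replicate trees that contain `clade`.

    `replicate_clades` is a list with one set of clades (frozensets of
    taxon names) per bootstrap replicate, obtained from an external
    tree search (hypothetical input, not computed here).
    """
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)
```

Each pseudoreplicate would then be passed to the tree search, and the percentage returned by `clade_support` corresponds to the bootstrap values reported above the branches.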

2.4. Incongruence length difference and significance tests

To examine the difference in phylogenetic signal between gene partitions, incongruence length differences (ILDs) were calculated as described by Farris et al. (1994). This index measures the amount of extra homoplasy that results from the combining of two or more data partitions in a simultaneous analysis. The 'total evidence' matrix was divided into four data partitions representing Pglym, EF-1α, Cytb, and COI. Incongruence between data partitions was assessed using the partition homogeneity test in PAUP (Farris et al., 1994), with only the taxa common to both partitions included in the analysis. The statistical significance of the incongruence length difference (ILD) was assessed by executing 999 random partitions of the data.
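As an illustration of the arithmetic, the ILD is the length of the shortest tree for the combined data minus the sum of the shortest tree lengths for the separate partitions. The sketch below assumes that these tree lengths come from an external parsimony search; the function names and the p-value convention (the observed partition is counted as one of the replicates) are assumptions for this example and may differ in detail from the PAUP implementation.

```python
def ild(combined_length, partition_lengths):
    """Incongruence length difference: extra steps required when the
    partitions are analysed together rather than separately."""
    return combined_length - sum(partition_lengths)


def ild_p_value(observed_ild, random_ilds):
    """Permutation p-value: share of partitions (the random ones plus
    the observed one) whose ILD is at least as large as the observed."""
    as_extreme = sum(1 for value in random_ilds if value >= observed_ild)
    return (1 + as_extreme) / (1 + len(random_ilds))


# Hypothetical tree lengths, for illustration only.
example = ild(1600, [1050, 530])
```

With 999 random partitions, the smallest attainable p-value under this convention is 1/1000, which is consistent with reporting p < 0.001 only as a bound.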

2.5. Defining character support and conflict

Partitioned Bremer support (PBS) was used to show the relative contribution of each gene partition to the Bremer support of the simultaneous analysis (Baker and DeSalle, 1997). Values can be positive, negative or zero, and the sum of all the partitioned Bremer support values at a node will equal the Bremer support value for that node. A positive PBS suggests support for the node by that gene, whereas a negative PBS indicates that evidence in the data partition is inconsistent with that node. The partitioned Bremer support values were calculated using the partitioned constraint file in TreeRot (Sorenson, 1999). When data sets are merged in simultaneous analyses, the combined matrix sometimes supports unique relationships not favoured by any of the individual data sets. In such cases, common character support for these relationships is revealed that is otherwise 'hidden' in the separate data partitions. This type of hidden support and conflict was quantified as hidden branch support (HBS) (Gatesy et al., 1999).
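The additivity of PBS can be checked directly against Table 3. The sketch below uses the values for nodes 1 and 3 from that table; the function names are illustrative, and the HBS formula (Bremer support in the simultaneous analysis minus the summed Bremer support for the same node in the separate analyses) is our reading of Gatesy et al. (1999) and should be treated as an assumption.

```python
def pbs_total(pbs_by_partition):
    """Sum of partitioned Bremer support values at one node; this
    should equal the Bremer support of the node (up to rounding)."""
    return sum(pbs_by_partition.values())


def hidden_branch_support(combined_bs, separate_bs):
    """HBS for one node: support in the simultaneous analysis minus the
    summed support for that node in the separate analyses (assumed
    definition, after Gatesy et al., 1999)."""
    return combined_bs - sum(separate_bs)


# PBS values for nodes 1 and 3 as listed in Table 3.
node1 = {"Pglym": 0.00, "EF-1a": 0.46, "Cytb": -0.23, "COI": 3.77}
node3 = {"Pglym": 0.00, "EF-1a": 3.28, "Cytb": -0.14, "COI": 14.86}
```

For node 1 the four partition values sum to the Bremer support of 4, and for node 3 to 18, as the additivity property requires.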

2.6. Analysing the effect of missing data

It has been widely recognised that the inclusion of taxa with much missing data may lead to a dramatic increase in the number of most-parsimonious trees (MPTs). To assess this effect, the ratio of the number of MPTs in the combined analysis over the number of MPTs in the various data partitions was calculated. The accuracy of a data set containing missing data was measured as the similarity between the simultaneous analysis and the individual data partitions. Similarity was calculated in PAUP 4.0b4a (Swofford, 1999) based on both the standard 'd' statistic, which measures the quartets that are resolved and different between trees (Estabrook et al., 1985), and the 'symmetric difference' (SD), which is based on the proportion of different bipartitions between trees (Penny and Hendy, 1985).
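The symmetric difference can be illustrated with a small sketch that counts the bipartitions present in one tree but not in the other. Trees are written here as nested tuples of taxon names, which is an assumption of this example (the actual comparison was made in PAUP), and both trees are assumed to contain the same taxa. The function returns a raw count; dividing by the total number of non-trivial bipartitions in the two trees would give a proportion.

```python
def leaves(tree):
    """Set of taxon names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Non-trivial bipartitions of a tree, each stored as the side
    that does not contain a fixed anchor taxon."""
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        if anchor in side:
            side = all_taxa - side
        # Skip trivial splits (empty, single taxon, or all but one).
        if 1 < len(side) < len(all_taxa) - 1:
            splits.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def symmetric_difference(tree_a, tree_b):
    """Number of bipartitions found in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Placeholder taxa A-E; these are not relationships from the study.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
```

The two placeholder trees share the split separating D and E from the rest but disagree on the grouping of A, B and C.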

2.7. MRP reconstruction

The gene trees obtained from individual partitions were also combined using a supertree methodology. Matrix representation with parsimony (MRP) was used because this method is applicable whether or not the source trees are congruent (Baum, 1992; Ragan, 1992). The strict consensus of each gene tree containing overlapping terminal taxa was used for recoding each node as binary characters in RadCon following Baum/Ragan's coding scheme (Thorley and Page, 2000). For the supertree reconstruction, a heuristic search under maximum parsimony was performed using 1000 replicates of random addition sequence using TBR and keeping only 100 trees at each replicate. In the parsimony analysis, characters of the binary matrix were considered as unordered as they produce slightly better topologies (Salamin et al., 2002). Templeton's Wilcoxon signed-ranks test (Templeton, 1983) and the Kishino–Hasegawa (KH) test (Kishino and Hasegawa, 1989) as implemented in PAUP were used to compare the differences in topologies between the 'total evidence' simultaneous analysis phylogeny and the 'supertree' reconstruction.
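The Baum/Ragan coding can be sketched as follows: every clade of every source tree becomes one binary character, scored 1 for taxa inside the clade, 0 for the other taxa of that source tree, and ? for taxa absent from that source tree. The nested-tuple tree representation and the function names are assumptions of this example; the recoding in this study was done in RadCon, and the all-zero outgroup that is commonly added to root the MRP matrix is omitted here.

```python
def leaves(tree):
    """Set of taxon names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Clades of a rooted tree in pre-order, excluding the root and
    single leaves."""
    found = []

    def visit(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.append(leaves(node))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def mrp_matrix(source_trees):
    """Baum/Ragan matrix: taxon name -> string over '0', '1', '?'."""
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    matrix = {taxon: "" for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    matrix[taxon] += "?"   # taxon missing from this tree
                elif taxon in clade:
                    matrix[taxon] += "1"
                else:
                    matrix[taxon] += "0"
    return matrix


# Placeholder source trees with partly overlapping taxa.
matrix = mrp_matrix([(("A", "B"), ("C", "D")), (("A", "B"), "E")])
```

The resulting matrix would then be analysed under parsimony like any other character matrix, which is why source trees with different taxon sets can be combined.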

3. Results

3.1. Sequence variation

The length of the sequenced COI fragment was 829 bp for the 41 specimens evaluated (89%) and of these characters, 331 were variable between taxa, 292 of which were parsimony informative. The COI gene is approximately 65% A+T biased, and the transition/transversion ratio for COI considering all positions was 1.47. Cytochrome b (487 bp) was successfully amplified and sequenced for 34 taxa (74%). Alignment of all sequences yielded 198 variable characters (excluding a single insertion/deletion of 6 bp in C. caryae) of which 173 were parsimony informative characters. Cytb shows the highest frequencies of A+T with a percentage of 70% and a ts/tv ratio of 1.47. These mitochondrial sequences verify the general observation from insects of a strong A+T bias (e.g., Crozier and Crozier, 1993). The A+T bias is lower in the two nuclear genes: 62 and 59% for EF-1α and Pglym, respectively, and so is the ts/tv ratio (1.18 and 1.31). The 33 taxa (72%) sequenced for EF-1α had 27.4% of their 613 sites variable with only 78 phylogenetically informative and the 21 sequences (46%) for Pglym had 23.4% of their sites variable with only 73 variable sites being parsimony informative (refer to Table 2).
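The site categories used above can be made concrete with a short sketch. A site is counted as variable if at least two different bases occur, and as parsimony informative if at least two bases each occur in at least two sequences. The alignment format (taxon name mapped to an aligned sequence), the function names and the choice to ignore gaps and ambiguity codes are assumptions of this example.

```python
from collections import Counter


def site_summary(alignment):
    """Return (variable sites, parsimony-informative sites)."""
    seqs = list(alignment.values())
    variable = informative = 0
    for i in range(len(seqs[0])):
        # Count unambiguous bases only; gaps and '?' are ignored.
        counts = Counter(s[i] for s in seqs if s[i] in "ACGT")
        if len(counts) > 1:
            variable += 1
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


def at_content(seq):
    """Proportion of A and T among the unambiguous bases."""
    bases = [b for b in seq if b in "ACGT"]
    return sum(b in "AT" for b in bases) / len(bases)


# Toy alignment, for illustration only.
toy = {"t1": "AAGT", "t2": "AAGC", "t3": "ACAT", "t4": "ACAG"}
```

In the toy alignment the first site is constant, the second and third are parsimony informative, and the fourth is variable but uninformative. Applied to the COI figures above, 331 variable sites out of 829 correspond to the roughly 40% variable sites listed in Table 2.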

3.2. Phylogenetic analysis

Unweighted parsimony analysis of each gene results in many shortest trees (see Fig. 1). The individual data sets support a variety of topologies. The venosus–pellitus–elephas clade is more basal in the Cytb tree and C. caryae is more basal in the COI phylogeny. The RI as a measure for the degree to which the individual data sets are congruent internally decreased in the following order (Pglym > Cytb > EF-1α > COI, Table 2). All gene trees show some degree of resolution and provide support at different levels; EF-1α provides better resolution mainly at deeper levels within the genus; COI resolved mainly at the tips of the phylogeny. However, results from the independent analyses of COI, EF-1α, Cytb, and Pglym are in conflict, in particular between Cytb and the three other genes including the other mitochondrial marker (COI: ILD = 20, p < 0.001). The main topological incongruence is the position of the venosus–pellitus–elephas clade, which is more basal in the Cytb phylogeny than in the other three genes, however with moderate support only. On the other hand, the result of the partition homogeneity test for the two nuclear genes EF-1α and Pglym indicates that the nuclear genes support the same tree (Fig. 2, ILD = 1, p = n.s.). The rate of change is greater in the mitochondrial genes than in the other genes, with pairwise average divergence in Cytb and COI at 14.7 and 13.8%, respectively. This level of divergence, the A+T bias, as well as high transition/transversion ratio have been shown to affect the significance level of the ILD test (Dolphin et al., 2000), and hence there is little empirical basis for conflicting signal (Fig. 2).

When all genes were combined to obtain a data set comprised of 2514 characters, of which 616 were parsimony informative, the simultaneous analysis tree is almost fully resolved with BS scores ranging from +1 to +51 and a majority of BS scores above +6. To determine the relative support of the different gene partitions to the branch support of the simultaneous analysis tree (Fig. 3), the partitioned Bremer support (PBS) was calculated. Most of the support for the combined analysis phylogeny comes from COI (Table 3). The evidence from the Cytb data is inconsistent with the other three genes for 12 out of the 33 nodes. Pglym is of minimum use as the PBS scores of zero indicate the indifference of this data set at 27 nodes out of 33 and the data set only provides net positive support for four nodes. The limitations of this gene are probably a result of the small number of taxa sampled. Nodes 6, 22, and 33 have no negative PBS scores, i.e., all the data partitions support these nodes; the other nodes show a more even mixture of positive and negative PBS scores. Nodes 9 and 10 are the only nodes where three partitions are negative but positive support rests on the signal from COI. The majority of nodes are characterised by hidden conflict, as measured by HPBS (Gatesy et al., 1999) in the simultaneous analysis. Nonetheless, six nodes have positive HBS, all of which are nodes absent from all of the individual gene trees due either to missing data or lack of resolution but supported in the simultaneous analysis.

Table 2
Sequence characteristics and tree statistics for the individual genes and the combined analyses

| | COI | Cytb | EF-1α | Pglym | mtDNA | nucDNA | All genes |
|---|---|---|---|---|---|---|---|
| No. of taxa | 41 | 34 | 33 | 21 | 46 | 37 | 46 |
| No. of species | 21 | 16 | 18 | 13 | 22 | 19 | 22 |
| Sequence length | 829 | 487 | 613 | 585 | 1316 | 1198 | 2514 |
| Indels (bp) | 0 | 6 | 42–129 | 15–116 | - | - | - |
| Percent variable sites (%) | 40 | 40.6 | 27.4 | 23.4 | 40.3 | 25.4 | 32.8 |
| Informative characters | 292 | 173 | 78 | 73 | 465 | 151 | 616 |
| Mean base frequency A | 0.287 | 0.331 | 0.303 | 0.318 | 0.309 | 0.310 | 0.309 |
| Mean base frequency C | 0.227 | 0.199 | 0.212 | 0.217 | 0.213 | 0.214 | 0.214 |
| Mean base frequency G | 0.106 | 0.104 | 0.168 | 0.191 | 0.105 | 0.1795 | 0.142 |
| Mean base frequency T | 0.378 | 0.365 | 0.315 | 0.273 | 0.372 | 0.294 | 0.333 |
| Ti/Tv ratio | 1.469 | 1.477 | 1.184 | 1.310 | - | - | - |
| Pairwise sequence divergence (a) | 13.8 ± 3.9 | 14.7 ± 4.6 | 4.3 ± 2.7 | 5.5 ± 3.2 | - | - | - |
| Steps | 1038 | 519 | 279 | 204 | 2011 | 476 | 2094 |
| CI | 0.455 | 0.505 | 0.763 | 0.819 | 0.459 | 0.790 | 0.501 |
| RI | 0.734 | 0.764 | 0.749 | 0.792 | 0.735 | 0.771 | 0.797 |

(a) Nucleotide distance corrected using Tamura and Nei's (1993) method. Mean ± SD.

Fig. 1. The phylogenies of two mitochondrial (COI and Cytb) and two nuclear genes (EF-1α and Pglym) for the genus Curculio. The COI gene tree is the consensus of 21 most-parsimonious trees (MPTs) (length = 1038, CI = 0.455, RI = 0.734). The Cytb gene tree is a consensus of 70 MPTs (length = 519, CI = 0.505, RI = 0.764). The EF-1α gene tree is a consensus of 8159 MPTs (length = 279, CI = 0.763, RI = 0.749). The Pglym gene tree is a consensus of 55 MPTs (length = 204, CI = 0.819, RI = 0.792). Support for each node is represented by the bootstrap values (%) above the branch (shown only when >50%) and the Bremer support below the branch.

Fig. 2. The mitochondrial and nuclear phylogenies for combined data sets. The mtDNA gene tree is a consensus of 2011 MPTs (length = 1601, CI = 0.459, RI = 0.735) for the combined analysis of COI and Cytb. The nuclear DNA gene tree is a consensus of 7934 MPTs (length = 476, CI = 0.79, RI = 0.77) of the two nuclear genes EF-1α and Pglym. Indices for node supports are as in Fig. 1.

3.3. The effect of missing data

The accuracy measured by both �d� and SD agree

closely and there is no general trend for decreased

Page 8: The phylogeny of acorn weevils (genus Curculio) from mitochondrial and nuclear DNA sequences: the problem of incomplete data

Fig. 3. Consensus tree of 5795 MPTs (length = 2094, CI = 0.53, RI = 0.73) obtained from the simultaneous analysis of COI, Cytb, EF-1a, and Pglym. Bootstrap values are above the branch and Bremer support below. See Table 3 for the contribution of each gene to the Bremer support at the corresponding circled node.


accuracy with increasing percentage of missing data, but there is an increase in the number of resolved nodes (Table 4). The addition of a character set with missing data either increased or had little effect on the number of MPTs. The largest increase in number of MPTs occurred when the Cytb and COI data sets were combined for the mitochondrial phylogeny (Fig. 2 and Table 4). Although this could be a result of the missing data in the Cytb data set, it could also be a result of the conflicting clades between the COI and Cytb phylogenies. The addition of Pglym containing missing data to the EF-1a data set decreases the resolution of the tree even though there is little change in the number of MPTs (Fig. 2 and Table 4). Only the simultaneous analysis of


Table 3

Partitioned Bremer support for the simultaneous analysis tree (Fig. 3)

Node            BS     HBS   Pglym     EF1    Cytb     COI Nuclear   mtDNA
1                4       4    0.00    0.46   -0.23    3.77    0.46    3.54
2               37       0    0.00   -0.67    0.33   37.33   -0.67   37.67
3               18       0    0.00    3.28   -0.14   14.86    3.28   14.72
4                1       1    1.00    0.67   -0.33   -0.33    1.67   -0.67
5                1       0    0.00    0.06   -0.03    0.97    0.06    0.94
6               19     -19    2.00    2.22   -0.11   14.89    4.22   14.78
7                9     -11    0.00   -0.67    0.33    9.33   -0.67    9.67
8                1      -1    0.00    0.80    0.10    0.10    0.80    0.20
9               11     -21   -0.50   -0.67  -18.67   30.83   -1.17   12.17
10               4      -4   -2.00   -0.67  -23.67   30.33   -2.67    6.67
11               1      -1    0.00    2.53   -1.27   -0.27    2.53   -1.53
12              32      -3    0.00    0.33   -0.17   31.83    0.33   31.67
13               1       0    0.00    1.58   -0.67    0.08    1.58   -0.58
14               1      -1    0.00    1.93   -0.87   -0.07    1.93    0.93
15              14     -42    0.00    0.33   -0.17   13.83    0.33   13.67
16               1       0    0.00    0.04    0.98   -0.02    0.04    0.96
17              15     -30    0.00    1.22   13.89   -0.11    1.22   13.78
18              13     -19    0.00    1.22    4.89    6.89    1.22   11.78
19               6     -18    2.00    1.67    2.17    0.17    3.67    2.33
20               2       2    1.00    0.33    0.33    0.33    1.33    0.67
21              37      37    0.00    0.33   -0.17   36.83    0.33   36.67
22               8       8    0.00    0.00    0.00    8.00    0.00    8.00
23              20       0    0.00   -0.14   22.07   -1.93   -0.14   20.14
24               1       0    0.00    2.33   -1.17   -0.17    2.33   -1.33
25               4     -12    0.00    0.41   -0.21    3.79    0.41    3.59
26              11      11    0.00    1.33   -0.67   10.33    1.33    9.67
27               4       0    0.00   -0.08    0.04    4.04   -0.08    4.08
28               3      -9    0.00    0.33    2.83   -0.17    0.33    2.67
29               2     -15    0.00   -0.17    0.08    2.08   -0.17    2.17
30              33     -53    0.00   -0.67    0.33   33.33   -0.67   33.67
31              10     -20    0.00    0.33   -0.17    9.83    0.33    9.67
32               3      -6    0.00    2.43   -0.77    1.33    2.43    0.57
33              51       0    0.00    3.33    8.33   39.33    3.33   47.67
Total PBS      378            3.50   25.82    7.26  341.42   29.32  348.68
PBS/PI        0.61            0.05    0.33    0.04    1.17    0.19    0.75
PBS/steps     0.18            0.02    0.09    0.01    0.33    0.06    0.17

BS, Bremer support; HBS, hidden branch support. Pglym, EF1, Cytb, and COI are the individual gene partitions; Nuclear and mtDNA are the combined nuclear and mitochondrial partitions. HBS is not given for the summary rows.

The contribution of each gene (and each of mitochondrial and nuclear data sets) to the Bremer support at the corresponding nodes. The total PBS indices are standardised by the number of phylogenetically informative characters (PBS/PI) and by the number of steps (PBS/steps). Nodes present in both the simultaneous analysis and the individual gene partitions are marked in bold.

Table 4

Comparison of the level of resolution for the different combinations of data sets with various degrees of missing data

                                     COI    Cytb   EF-1a   Pglym   mtDNA  nucDNA      SA
Percentage of missing taxa (%)        11      26      28      54       0      19       0
No. of MPTs                           21      70    8159      55    2011    7934    5795
No. of resolved nodes                 31      24      19      13      31      20      33
'd' statistic                         12      14      19      10       6      24       0
SD                                    18      17      15       9      14      21       0
Total PBS                         341.42    7.26   25.82    3.50  348.68   29.32     378
Total BS                             367     188      46      40     362      40     641

SA, simultaneous analysis. Accuracy was measured as the similarity of the separate analysis to the simultaneous analysis using both the 'd' statistic and the symmetric difference (SD).
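As an illustration of the symmetric difference used as an accuracy measure, the following minimal sketch counts the clades found in one tree but not the other. It assumes rooted trees written as nested Python tuples over the same set of taxa (i.e., already pruned to shared taxa); the function names and the toy trees are hypothetical and are not the trees of this study.

```python
def clades(tree):
    """Return (all leaves, set of internal clades) for a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf is a taxon name
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)                  # record the clade below this node
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)              # the root clade is shared trivially
    return all_leaves, found


def symmetric_difference(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same taxa")
    return len(clades_a ^ clades_b)


# Hypothetical toy trees for five taxa.
TREE_A = ((("A", "B"), "C"), ("D", "E"))
TREE_B = ((("A", "C"), "B"), ("D", "E"))
TREE_C = (("A", "B", "C"), ("D", "E"))     # same as TREE_A with a polytomy

if __name__ == "__main__":
    print(symmetric_difference(TREE_A, TREE_B))
    print(symmetric_difference(TREE_A, TREE_C))
```

A less resolved tree differs from a fully resolved one only by the clades it lacks, so a polytomy raises the score by one per collapsed node.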


all four genes increases the resolution of the phylogeny, but the tree loses overall support, as shown by the lower total PBS than the total BS of the separate analyses (Table 4).
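The comparison of support totals can be reproduced from the values in Table 4. The sketch below is only arithmetic on the published numbers; the variable names are hypothetical, and reading the difference as the net loss of support is an assumption about how the comparison was intended.

```python
# Total Bremer support (BS) of each separate gene analysis, from Table 4.
separate_bs = {"COI": 367, "Cytb": 188, "EF-1a": 46, "Pglym": 40}

# Total PBS of the simultaneous analysis of all four genes, from Table 4.
total_pbs_simultaneous = 378

# Sum of the separate totals and its difference from the simultaneous total.
summed_separate_bs = sum(separate_bs.values())
difference = total_pbs_simultaneous - summed_separate_bs

if __name__ == "__main__":
    print(summed_separate_bs, difference)
```

The summed value equals the 641 listed as Total BS in the SA column of Table 4.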

3.4. MRP reconstruction

Using Baum and Ragan's coding scheme to determine the binary matrix of each source tree, the parsimony


analysis yielded 3063 most-parsimonious trees, a strict consensus of which was used to obtain a consensus composite tree (Fig. 4). When the character matrix of the combined data set was constrained to this topology, the KH-test rejected the supertree (t = 3.2, p < 0.001) as a suitable alternative to the parsimony tree derived from the combined data. Weighting the contributions of individual trees under different schemes was attempted to retrieve a more resolved supertree. This was done under

Fig. 4. Consensus composite tree of 3063 MPTs (length = 93, CI = 0.871, RI = 0.976) obtained with the Baum/Ragan coding scheme using the four consensus gene trees as source trees. Bremer support values are shown below the branch. This tree is significantly different from the 'total evidence' phylogeny (Kishino–Hasegawa test: t = 3.2, p < 0.001; Templeton–Wilcoxon signed-ranks: z = -3.1, p < 0.001).

MRP by weighting characters in the new MRP matrix. Weighting the trees according to the number of parsimony informative characters of each source tree and even weighting the highly resolved COI source tree to the maximum of 32,767 did not help retrieve a more resolved phylogeny.
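The Baum/Ragan coding step can be sketched as follows. This is a minimal illustration that assumes rooted source trees written as nested Python tuples; the function names and the two toy source trees are hypothetical. Each internal node below the root becomes one binary character: taxa in the clade are scored 1, other taxa in that source tree 0, and taxa absent from the source tree '?'. The all-zero outgroup row usually added to root an MRP matrix, and any character weighting, are omitted here.

```python
def leaf_set(node):
    """All taxon names below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def internal_clades(tree):
    """Leaf sets of the internal nodes below the root, in preorder."""
    result = []

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            result.append(leaf_set(node))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return result


def mrp_matrix(source_trees):
    """Binary matrix representation of a list of source trees."""
    taxa = sorted(set().union(*(leaf_set(t) for t in source_trees)))
    matrix = {taxon: [] for taxon in taxa}
    for tree in source_trees:
        present = leaf_set(tree)
        for clade in internal_clades(tree):
            for taxon in taxa:
                if taxon not in present:
                    matrix[taxon].append("?")   # taxon missing from this tree
                elif taxon in clade:
                    matrix[taxon].append("1")
                else:
                    matrix[taxon].append("0")
    return {taxon: "".join(states) for taxon, states in matrix.items()}


# Hypothetical source trees with partially overlapping taxa.
SOURCE_TREES = [((("A", "B"), "C"), "D"), (("B", "C"), "E")]

if __name__ == "__main__":
    for taxon, row in mrp_matrix(SOURCE_TREES).items():
        print(taxon, row)
```

Because every source tree contributes one character per resolved node regardless of how many sequence characters support that node, differences in the strength of the underlying evidence are not carried over unless the characters are weighted.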

4. Discussion

4.1. Combined and separate analysis of gene data

In this study, we based the phylogeny on four genes (COI, Cytb, EF-1a, and Pglym). Although Cytb and COI should reflect a single genetic linkage group with the same phylogenetic history, different topologies were obtained from the data partitions. The conflict between the data sets is indicated by negative PBS scores for Cytb at several nodes in the simultaneous analysis. A number of reasons can be put forward for the differences in topology, such as sampling error with respect to the slightly different set of taxa used in each data set, the different stochastic processes acting on the characters, or possibly true differences in the branching histories of different genes. First, even though the ILD test is frequently used to determine incongruence between data sets, the p-value of the partition homogeneity test can be greatly exaggerated, as 'noise' can affect the level of incongruence (Dolphin et al., 2000). Hence, what constitutes conflict between both partitions may be partly a consequence of the severe AT bias and great rate heterogeneity in mtDNA, factors which impact our ability to recover phylogenetic trees, in particular using parsimony reconstruction (Yang, 1998). It is remarkable that the combined mtDNA data exhibit almost no apparent ILD when combined with both nuclear genes, and hence whatever internal conflict exists between both mtDNA partitions does not obscure a common phylogenetic signal that is apparently contained in all four gene partitions.

While we chose to combine the data sets, the issue of combining data for phylogenetic analysis remains debated, with no current consensus (Bull et al., 1993; De Queiroz et al., 1995; Huelsenbeck et al., 1996; Miyamoto and Fitch, 1995). Although data set homogeneity guarantees increasing phylogenetic accuracy with data set combination, combining heterogeneous data can also increase accuracy, even if the analysis does not explicitly incorporate that heterogeneity (Barker and Lutzoni, 2002). In this case, EF-1a is useful for resolving the base of the phylogeny whereas COI resolves nodes closer to the tips of the tree, and the combination of the two improves the resolution of the full tree. Others have argued that varying levels of homoplasy in different data sets might contribute to an overall robust signal (Barrett et al., 1991; Nixon and Carpenter, 1996; Vidal and


Lecointre, 1998; Wenzel and Siddall, 1999) and this would appear to be the case in this study, as the support for a majority of branches increases in the combined analysis. According to the PBS, COI is the most influential data set in the simultaneous analysis: the total BS of the COI gene tree is 367 and the total PBS of the simultaneous analysis is 378.

Using the PBS to assess the utility of different genes clearly depends on the topology of the simultaneous analysis hypothesis and, therefore, can be sensitive to the addition of a few characters that change the topology of the most-parsimonious and the near most-parsimonious trees. Despite the significant incongruence found with the partition homogeneity test, the total PBS indices suggest that all the data sets contribute positively to the combined analysis and, when standardised by the number of phylogenetically informative characters (COI > EF-1a > Pglym > Cytb), COI provides the greatest support (54.4%) and Cytb the least (1.8%).

4.2. The effect of adding characters with missing data

Much of the debate on combining phylogenetic data sets is conducted on theoretical grounds or with complete data sets. Our study is of particular interest because the data set is incomplete with respect to each gene. Phylogeneticists are frequently faced with incomplete data sets, usually as a result of molecular laboratory problems. These are usually related to the quality of the starting template, which often degrades. Sometimes, the starting template is of insufficient quantity, in particular for nuclear gene amplification where there is only one copy of the gene per cell. Occasionally, the problems are related to the variability of primer binding sites, or the difficulty of finding the optimal annealing time and temperature for certain primer pairs. As a result, the amount of missing data may vary greatly from study to study.

In most analyses in which the separate matrices do not include exactly the same taxa, the authors exclude the taxa for which one data set is not available without any particular explanation except the desire to exclude missing data. This concern is legitimate, as it is widely known that including taxa with missing data may lead to a dramatic increase in the number of most-parsimonious trees and an accompanying loss of resolution in consensus trees (e.g., Novacek, 1992). In our study also, the number of MPTs increased in two out of three cases when an extra data set with missing taxa was added.

However, deleting taxa a priori may lead to unnecessarily discarding useful phylogenetic information (e.g., Donoghue et al., 1989; Doyle and Donoghue, 1987; Gauthier et al., 1988; Novacek, 1992). In the cases analysed here, although the number of parsimony trees increased, the total number of resolved nodes in the combined data analysis was greater than in either of the separate analyses (Table 4). It is prudent to assess the effects of incomplete taxa empirically on a case-by-case basis. As suggested by Wiens and Reeder (1995), there are three main reasons for including incomplete taxa: (1) they permit a more comprehensive taxonomic revision of the group and will provide a more rigorous analysis of character evolution, (2) they can improve the chances of estimating the right tree (Barrett et al., 1991; De Queiroz et al., 1995), and (3) on the principle of total evidence, the best hypothesis is the one that explains all of the relevant data simultaneously (Kluge, 1989).

Wiens (1998) suggests that the addition of a set of characters with missing data is generally more likely to increase phylogenetic accuracy but, as we have found in this study, the potential benefits of adding these characters may decrease as the percentage of missing data increases. A common theme in all phylogenetic studies is the desire to obtain better resolution, but this should not be obtained at the expense of ignoring data. In this study, the resolution and taxonomic scope of the phylogeny are improved by combining the incomplete data sets.
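How incomplete partitions enter a combined matrix can be sketched in a few lines. This is a minimal illustration, assuming each gene partition is a dictionary of pre-aligned sequences keyed by taxon; the function names, taxon labels and sequences are hypothetical. A taxon lacking a gene receives a block of '?' (missing data) of that gene's alignment length.

```python
def build_supermatrix(partitions, missing="?"):
    """Concatenate aligned gene partitions, padding absent taxa with '?'."""
    taxa = sorted({taxon for aln in partitions.values() for taxon in aln})
    rows = {taxon: [] for taxon in taxa}
    for gene, aln in partitions.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


def percent_missing_taxa(partitions):
    """Percentage of all sampled taxa that lack each gene partition."""
    taxa = {taxon for aln in partitions.values() for taxon in aln}
    return {gene: 100.0 * (len(taxa) - len(aln)) / len(taxa)
            for gene, aln in partitions.items()}


# Hypothetical toy data: 'sp2' failed to amplify for the nuclear gene.
PARTITIONS = {
    "mito_gene": {"sp1": "ACGT", "sp2": "ACGA", "sp3": "ATGA"},
    "nuclear_gene": {"sp1": "GG", "sp3": "GC"},
}

if __name__ == "__main__":
    for taxon, row in build_supermatrix(PARTITIONS).items():
        print(taxon, row)
    print(percent_missing_taxa(PARTITIONS))
```

Under parsimony, a '?' entry places no constraint on where the taxon attaches for that partition, which is why adding such partitions can increase the number of equally parsimonious trees.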

The confounding effect of combining data sets that conflict makes it difficult to determine whether the decrease in accuracy is the result of the ambiguities or the missing data. Moreover, missing data make measurements of PBS and HBS difficult to interpret and thus the support for a particular tree can be unreliable. Indeed, when a taxon is absent from a data partition, it is not possible to predict where it will fall when it is eventually added, and it may thus disrupt any relationship in the most parsimonious trees. It is likely that the negative HBS values in our analysis, which imply hidden conflict, are exaggerated or spurious. In this case, the cause of instability is likely to be character conflict between COI and Cytb rather than missing data, but removing the conflicting characters could obscure the alternative relationships.

The simultaneous analysis tree appears to perform well in capturing the most consistent signals from the individual partitions and these results add weight to the notion that combined analysis of all the data is an appropriate procedure for the treatment of multiple gene sequences even when the data matrices are incomplete. We prefer to have a phylogenetic hypothesis for these incomplete taxa that is mostly right rather than having no hypothesis for them at all.

4.3. MRP reconstruction

As an alternative to the 'total evidence' simultaneous analysis of combined data, supertrees can be generated to obtain comprehensive phylogenetic trees. The supertree method does not require complete overlap of the


terminal taxa, but a minimum overlapping set allows the method to produce more comprehensive phylogenies than the original ones. In this study, the general topology of the combined analysis is supported by the supertree except for a number of rarely sampled taxa forming polytomies and some discrepancies being found for individual taxa with conflicting gene evidence. These polytomies are likely to be caused by the methodology, i.e., the probable loss of information associated with summarizing the trees as character data. Given that our goal is to uncover phylogenetic relationships, this loss of information may be the most general argument against the supertree methodology and in favour of analyzing the combined character data. However, the time required for building supertrees is minimal in comparison with the runtime of a multigene matrix in PAUP. Thus, the supertree methodology may be advantageous in some cases where the individual trees do not conflict and are all well resolved, or for very large data sets that would otherwise require time-consuming analysis of the total evidence data.

4.4. Curculio phylogeny

While most of the intraspecific clades and a few interspecific clades have high bootstrap values and Bremer support values >8, one internal node exhibits a bootstrap value of less than 50% and another forms a polytomy. This phylogenetic tree is thus only a preliminary hypothesis of the relationships of Curculio species and may be open to reinterpretation in the light of further sampling. Most previously recognised species groups (Chittenden, 1926) are not monophyletic in the molecular phylogeny. In the cases where we analysed more than one taxon from Chittenden's (1926) species groupings, these formed either polyphyletic or paraphyletic groups, although close relatives were found in proximity in the tree. The nasicus species group is polyphyletic with the paraphyletic quercugriseae group embedded within it. Curculio proboscideus, the only species sampled from the proboscideus group, is basal to the previous groups. These results tend to support Gibson's (1969) decision not to classify the North American species into the seven separate groups suggested by Chittenden (1926). On the other hand, as the European species are polyphyletic, we suggest dividing them into an elephas species group and a glandium group. These groupings are also supported by morphological characteristics specific to each group. Species in the elephas group have convex elytra and a dense vestiture hiding the line of the first ventral segment, whereas those in the glandium group have flattened elytra and a first ventral segment clearly visible through the sparse scales. The molecular phylogeny also suggests that the fig-feeding Curculio, which represent all the Curculio species found in Africa, are the sister group to the seed-feeding clade. This conclusion will have to be confirmed by sampling more Oriental and African fig-feeding specimens.
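Whether a proposed species group is monophyletic on a given tree can be tested mechanically. The following minimal sketch assumes a rooted tree written as nested Python tuples; the function names, the taxon labels and the toy topology are hypothetical and do not reproduce the Curculio tree.

```python
def leaf_set(node):
    """All taxon names below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def all_clades(tree):
    """Leaf sets of every node in the tree, including leaves and the root."""
    found = set()

    def walk(node):
        found.add(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                walk(child)

    walk(tree)
    return found


def is_monophyletic(tree, group):
    """True if the group's members are exactly the leaves of one clade."""
    return frozenset(group) in all_clades(tree)


# Hypothetical rooted tree: group 'a' is split by a member of group 'b'.
TOY_TREE = ("outgroup", (("a1", "a2"), (("b1", "a3"), "b2")))

if __name__ == "__main__":
    print(is_monophyletic(TOY_TREE, {"a1", "a2"}))
    print(is_monophyletic(TOY_TREE, {"a1", "a2", "a3"}))
    print(is_monophyletic(TOY_TREE, {"b1", "b2"}))
```

A group that fails this test is either paraphyletic or polyphyletic; distinguishing the two requires looking at where the excluded taxa attach, which this sketch does not attempt.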

5. Conclusion

In general, phylogeny reconstruction is faced with the difficulty of how to combine multiple data. We agree with Gatesy et al. (2002) that the supermatrix is usually a better summary of phylogenetic data from multiple sources than the supertree. We have shown that even though the supertree method produces phylogenies rapidly and the backbone of the supertree follows that of the simultaneous analysis tree, the topology is more informative using the latter approach. Moreover, the supermatrix clearly reveals which characters are absent for a particular taxon. The primary data used for the supermatrix are also explicitly presented and any editing errors or sequence alignment discrepancies are accessible for scrutiny by other researchers. By comparison, the supertree does not clearly indicate the degree to which relationships are supported and therefore is not very useful in guiding future phylogenetic research.

This study also shows that it is not always necessary to exclude taxa or character sets a priori when data are missing because in some cases it is possible to obtain a resolved phylogeny with a wider taxonomic scope by adding more characters even if they do contain missing data. Although the current study is limited in its capacity to assess the accuracy when adding taxa with missing characters and in determining the amount of missing data that would significantly affect the phylogeny, it may direct future studies that do attempt to comprehensively address these issues. In this study, the goal was to provide a first accurate phylogeny of the genus Curculio. Its limitations obviously result from the narrow sampling confined to a few regional faunas and to a sub-sample of different life-history strategies, and from the resolving ability of the sequences selected. However, the suggested elephas species group, the glandium group and the clade caryae + nasicus + confusor are well supported both in the simultaneous analysis and the individual data sets, creating a basis for future sampling and providing a starting point for phylogenetic hypothesis-testing in the genus Curculio.

Acknowledgments

We thank Tim Barraclough, Chris Lyal, John Gatesy,

and Anthony Cognato for comments on earlier drafts of

the manuscript and advice on calculating hidden sup-

port. This work was funded by the Natural Environ-

ment Research Council (NERC) and the Natural

History Museum, London.


Appendix A

Specimens used in the phylogenetic analysis

Species          Ref. No.  Locality, Country                                Date       Collector
C. camelliae     20        Kamigamo, Kyoto, Japan                           1-May-01   T. Sota
C. camelliae     49        Kamigamo, Kyoto, Japan                           1-May-01   T. Sota
C. caryae        204       Shreveport, Louisiana, USA                       Fall 2000  A. Cognato
C. caryae        205       Shreveport, Louisiana, USA                       Fall 2000  A. Cognato
C. caryae        208       Shreveport, Louisiana, USA                       Fall 2000  A. Cognato
C. caryae        209       Shreveport, Louisiana, USA                       Fall 2000  A. Cognato
C. confusor      12        Washington State Forest, USA                     1-Jul-00   J. Hughes
C. confusor      32        Stokes State Forest, New Jersey, USA             1-Jul-00   J. Hughes
C. elephas       213       Saint Just Chalessin, Isere, France              2-Nov-01   F. Menu
C. elephas       216       Saint Just Chalessin, Isere, France              20-Nov-01  F. Menu
C. elephas       218       Saint Just Chalessin, Isere, France              20-Nov-01  F. Menu
C. glandium      200       Sopron, Hungary                                  8-Sep-01   G. Csoka
C. glandium      99        Foret de Fontainebleau, Seine et Marne, France   13-Aug-01  J&G Hughes
C. humeralis     24        Texas, USA                                       1-Oct-00   A. Cognato
C. humeralis     55        Texas, USA                                       1-Oct-00   A. Cognato
C. humeralis     56        Texas, USA                                       2-Oct-00   A. Cognato
C. iowensis      30        Hart State Park, USA                             18-Jul-00  J. Hughes
C. longidens     16        Tom Brown park, USA                              1-Jul-00   J. Hughes
C. longidens     19        Ringhaver, USA                                   1-Jul-00   J. Hughes
C. nasicus       14        Black Moshannon State Forest, USA                1-Jul-00   J. Hughes
C. nasicus       18        Canaan Valley, USA                               2-Jul-00   J. Hughes
C. nucum         243       St Denis le Thiboult, Seine Maritime, France     16-Jul-00  G. Hughes
C. nucum         177       St Denis le Thiboult, Seine Maritime, France     7-Aug-01   J&G Hughes
C. pardalis      26        Texas, USA                                       1-Oct-00   A. Cognato
C. pellitus      131       Foret de Dreuille, Allier, France                16-Aug-01  J&G Hughes
C. pellitus      151       St Denis le Thiboult, Seine Maritime, France     7-Aug-01   J&G Hughes
C. pellitus      179       Lyons la Foret, Eure, France                     6-Aug-01   J&G Hughes
C. pellitus      192       Turkey                                           -          M. Barclay
C. proboscideus  22        Texas, USA                                       1-Oct-00   A. Cognato
C. proboscideus  58        Texas, USA                                       2-Oct-00   A. Cognato
C. pyrrhoceras   1         UK                                               1-Oct-99   M. Barclay
C. pyrrhoceras   76        Silwood Park, UK                                 16-May-00  J. Hughes
C. salicivorus   6         Sartfield Nature Reserve, Isle of Man, UK        -          M. Barclay
C. scutellaris   5         Hong Kong                                        -          J. Mate
C. sulcatulus    23        Texas, USA                                       1-Oct-00   A. Cognato
C. sulcatulus    59        Texas, USA                                       2-Oct-00   A. Cognato
C. sulcatulus    68        Texas, USA                                       5-Oct-00   A. Cognato
C. undulatus     86        Turkey                                           -          M. Barclay
C. venosus       82        Silwood Park, UK                                 16-Aug-00  J. Hughes
C. venosus       149       St Denis le Thiboult, Seine Maritime, France     7-Aug-01   J&G Hughes
C. venosus       160       Foret des Bertranges, Nievre, France             14-Aug-01  J&G Hughes
C. venosus       166       St Leger, Haute Vienne, France                   17-Aug-01  J&G Hughes
C. victoriensis  25        Texas, USA                                       1-Oct-00   A. Cognato
C. victoriensis  54        Texas, USA                                       1-Oct-00   A. Cognato
Unidentified*    8         Africa                                           -          D.J. Mann
Unidentified*    27        Pakistan                                         -          D.J. Mann

* These specimens are as yet unidentified due to the lack of available identification keys and voucher specimens in the Natural History Museum.

References

Alonso-Zarazaga, M.A., Lyal, C.H.C., Viñolas, A., 1999. A World Catalogue of Families and Genera of Curculionoidea: (Insecta: Coleoptera): (Excepting Scolytidae and Platypodidae). Entomopraxis S.C.P., Spain, Barcelona.

Anderson, R.S., 1995. An evolutionary perspective on diversity in

Curculionoidea. Mem. ent. Soc. Wash. 14, 103–114.

Baker, R.H., DeSalle, R., 1997. Multiple sources of character

information and the phylogeny of Hawaiian Drosophilids. Syst.

Biol. 46, 654–673.

Barker, F.K., Lutzoni, F.M., 2002. The utility of the incongruence

length difference test. Syst. Biol. 51, 625–637.

Barrett, M., Donoghue, M.J., Sober, E., 1991. Against consensus. Syst.

Zoo. 40, 486–493.

Baum, B.R., 1992. Combining trees as a way of combining data sets

for phylogenetic inference, and the desirability of combining gene

trees. Taxon 41, 3–10.

Bremer, K., 1988. The limits of amino-acid sequence data in

angiosperm phylogenetic reconstruction. Evolution 42, 795–803.

Bull, J.J., Huelsenbeck, J.P., Cunningham, C.W., Swofford, D.L.,

Waddell, P.J., 1993. Partitioning and combining data in phyloge-

netic analysis. Syst. Biol. 42, 384–397.

Caterino, M.S., Cho, S., Sperling, F.A.H., 2000. The current state of

insect molecular systematics: a thriving Tower of Babel. Annu.

Rev. Entomol. 45, 1–54.

Chittenden, F.H., 1926. Classification of the nut curculios, (formally

Balaninus) of Boreal America. Entomol. Am. 7, 128–208.

Cho, S.W., Mitchell, A., Regier, J.C., Mitter, C., Poole, R.W.,

Friedlander, T.P., Zhao, S.W., 1995. A highly conserved nuclear

gene for low-level phylogenetics—elongation factor-1 alpha recov-

ers morphology-based tree for Heliothine moths. Mol. Biol. Evol.

12, 650–656.

Cognato, A.I., Sperling, F.A.H., 2000. Phylogeny of Ips DeGeer

species (Coleoptera: Scolytidae) inferred from mitochondrial cyto-

chrome oxidase I DNA sequence. Mol. Phylogenet. Evol. 14, 445–

460.

Cognato, A.I., Vogler, A.P., 2001. Exploring data interaction and

nucleotide alignment in a multiple gene analysis of Ips (Coleoptera:

Scolytinae). Syst. Biol. 50, 758–780.

Collins, J.K., Mulder, P.G., Grantham, R.A., Reid, W.R., Smith,

M.W., Eikenbary, R.D., 1997. Assessing feeding preferences of

pecan weevil (Coleoptera: Curculionidae) adults using a Hardee

olfactometer. J. Kans. Entomol. Soc. 70, 181–188.

Collins, J.K., Mulder, P.G., Smith, M.W., Eikenbary, R.D., 1996.

Mating behavior and peak mating activity of the pecan weevil

Curculio caryae (Horn). Southw. Entomol. 21, 479–481.

Coutin, R., 1960. Estimation de l'importance des populations d'imagos de Balaninus elephas Gyll. dans une chataigneraie cévenole. Rev. Zool. Agric. Appl. 59, 1–5.

Crozier, R.H., Crozier, Y.C., 1993. The mitochondrial genome of the

honeybee Apis-Mellifera—complete sequence and genome organi-

zation. Genetics 133, 97–117.

Cryan, J.R., Wiegmann, B.M., Deitz, L.L., Dietrich, C.H., 2000.

Phylogeny of the treehoppers (Insecta: Hemiptera: Membracidae):

evidence from two nuclear genes. Mol. Phylogenet. Evol. 17, 317–

334.

Currie, P.D., Sullivan, T.D., 1994. Structure, expression and duplica-

tion of genes which encode phosphoglyceromutase of Drosophila

melanogaster. Genetics 138, 352–363.

Danforth, B.N., Ji, S.Q., 1998. Elongation Factor-1 alpha occurs as

two copies in bees: implications for phylogenetic analysis of EF-1

alpha sequences in insects. Mol. Biol. Evol. 15, 225–235.

De Queiroz, A., Donoghue, M.J., Kim, J., 1995. Separate versus

combined analysis of phylogenetic evidence. Annu. Rev. Ecol. Syst.

26, 657–681.

Debouzie, D., Pallen, C., 1987. Spatial distribution of chestnut weevil

Balaninus (=Curculio) elephas populations. In: Labeyrie, V.,

Fabres, G., Lachaise, D. (Eds.), Insect–Plants. Dr. W. Junk

Publishers, Dordrecht, Printed in the Netherlands, pp. 77–83.

Desouhant, E., 1996. Oviposition in the chestnut weevil Curculio

elephas Gyll (Coleoptera: Curculionidae). Ann. Soc. Entomol. Fr.

32, 445–450.

Desouhant, E., 1998. Selection of fruits for oviposition by the chestnut

weevil, Curculio elephas. Entomol. Exp. Appl. 86, 71–78.

Dolphin, K., Belshaw, R., Orme, C.D.L., Quicke, D.L.J., 2000. Noise

and incongruence: interpreting results of the incongruence length

difference test. Mol. Phylogenet. Evol. 17, 401–406.

Donoghue, M.J., Doyle, J.A., Gauthier, J., Kluge, A.G., Rowe, T.,

1989. The importance of fossils in phylogeny reconstruction. Annu.

Rev. Ecol. Syst. 20, 431–460.

Doyle, J.A., Donoghue, M.J., 1987. The importance of fossils in

elucidating seed plant phylogeny and macroevolution. Rev. Palae-

obot. Palynol. 50, 63–95.

Eikenbary, R.D., Raney, H.G., 1973. Intratree dispersal of the pecan

weevil. Environ. Entomol. 2, 927–930.

Estabrook, G.F., McMorris, F.R., Meacham, C.A., 1985. Comparison

of undirected phylogenetic trees based on subtrees of 4 evolution-

ary units. Syst. Zoo. 34, 193–200.

Farrell, B.D., Sequeira, A.S., O'Meara, B.C., Normark, B.B., Chung,

J.H., Jordal, B.H., 2001. The evolution of agriculture in beetles

(Curculionidae: Scolytinae and Platypodinae). Evolution 55, 2011–

2027.

Farris, J.S., Kallersjo, M., Kluge, A.G., Bult, C., 1994. Testing

significance of incongruence. Cladistics 10, 315–319.

Gatesy, J., Matthee, C., DeSalle, R., Hayashi, C., 2002. Resolution of

a supertree/supermatrix paradox. Syst. Biol. 51, 652–664.

Gatesy, J., O'Grady, P., Baker, R.H., 1999. Corroboration among

data sets in simultaneous analysis: hidden support for phylogenetic

relationships among higher level artiodactyl taxa. Cladistics 15,

271–313.

Gauthier, J., Kluge, A.G., Rowe, T., 1988. Amniote phylogeny and the

importance of fossils. Cladistics 4, 105–209.

Gibson, L.P., 1969. Monograph of the genus Curculio in the New

World (Coleoptera: Curculionidae). Part I. United States and

Canada. Misc. Publs. Ent. Soc. Am. 6, 239–285.

Harris, M.K., 1976. Pecan weevil infestation of pecans of various sizes

and infestations. Environ. Entomol. 5, 248–250.

Hillis, D.M., Huelsenbeck, J.P., Cunningham, C.W., 1994. Application

and accuracy of molecular phylogenies. Science 264, 671–677.

Hoffmann, A., 1954. Coléoptères Curculionides (Deuxième Partie).

Faune de France. Paris.

Hovemann, B., Richter, S., Walldorf, U., Cziepluch, C., 1988. Two

genes encode related cytoplasmic elongation factors-1 alpha (EF-

1alpha) in Drosophila Melanogaster with continuous and stage

specific expression. Nucleic Acids Res. 16, 3175–3194.

Huelsenbeck, J.P., 1991. When are fossils better than extant taxa in

phylogenetic analysis. Syst. Zoo. 40, 458–469.

Huelsenbeck, J.P., Bull, J.J., Cunningham, C.W., 1996. Combining

data in phylogenetic analysis. Trends Ecol. Evol. 11, 152–158.

Jermiin, L.S., Crozier, R.H., 1994. The Cytochrome b region in the

mitochondrial–DNA of the ant Tetraponera Rufoniger-sequence

divergence in Hymenoptera may be associated with nucleotide

content. J. Mol. Evol. 38, 282–294.

Jordal, B.H., 2002. Elongation factor 1 alpha resolves the monophyly

of the haplodiploid ambrosia beetles Xyleborini (Coleoptera:

Curculionidae). Insect Mol. Biol. 11, 453–465.

Kishino, H., Hasegawa, M., 1989. Evaluation of the maximum-

likelihood estimate of the evolutionary tree topologies from DNA-

sequence data, and the branching order in Hominoidea. J. Mol.

Evol. 29, 170–179.

Kluge, A.G., 1989. Metacladistics. Cladistics 5, 291–294.


Krzywinski, J., Wilkerson, R.C., Besansky, N.J., 2001. Toward

understanding Anophelinae (Diptera, Culicidae) phylogeny: in-

sights from nuclear single-copy genes and the weight of evidence.

Syst. Biol. 50, 540–556.

Loxdale, H.D., Lushai, G., 1998. Molecular markers in entomology.

Bull. Entomol. Res. 88, 577–600.

Martin, H., 1949. Contribution a l'étude du Balanin des noisettes.

Revue Path. veg. 28, 3–28.

Maus, C., Peschke, K., Dobler, S., 2001. Phylogeny of the genus

Aleochara inferred from mitochondrial cytochrome oxidase se-

quences (Coleoptera: Staphylinidae). Mol. Phylogenet. Evol. 18,

202–216.

Menu, F., Debouzie, D., 1995. Larval development variation and adult

emergence in the chestnut weevil Curculio elephas Gyllenhal (Col,

Curculionidae). J. Appl. Entomol.-Z. Angew. Entomol. 119, 279–

284.

Miyamoto, M.M., Fitch, W.M., 1995. Testing species phylogenies and

phylogenetic methods with congruence. Syst. Biol. 44, 64–76.

Nixon, K.C., Carpenter, J.M., 1996. On simultaneous analysis.

Cladistics 12, 221–241.

Novacek, M.J., 1992. Fossils, topologies, missing data, and the higher

level phylogeny of Eutherian mammals. Syst. Biol. 41, 58–73.

Pelsue, F.W., 2000. A review of the genus Curculio L. from

China with descriptions of new taxa. Part I. (Coleoptera:

Curculionidae: Curculioninae: Curculionini). Coleopt. Bull. 54,

125–142.

Pelsue, F.W., Zhang, R., 2000. A review of the Curculio from China

with descriptions of new taxa. Part II. The Curculio alboscutellatus

group (Curculionidae: Curculioninae: Curculionini). Coleopt. Bull.

54, 467–496.

Penny, D., Hendy, M.D., 1985. The use of tree comparison metrics.

Syst. Zool. 34, 75–82.

Platnick, N.I., Griswold, C.E., Coddington, J.A., 1991. On missing

entries in cladistic analysis. Cladistics 7, 337–343.

Ragan, M.A., 1992. Phylogenetic inference based on matrix represen-

tation of trees. Mol. Phylogenet. Evol. 1, 53–58.

Remsen, J., O'Grady, P., 2002. Phylogeny of Drosophilinae (Diptera:

Drosophilidae), with comments on combined analysis and charac-

ter support. Mol. Phylogenet. Evol. 24, 249–264.

Salamin, N., Hodkinson, T.R., Savolainen, V., 2002. Building super-

trees: an empirical assessment using the grass family (Poaceae).

Syst. Biol. 51, 136–153.

Schraer, S.M., Harris, M., Jackman, J.A., Biggerstaff, M., 1998. Pecan

weevil (Coleoptera: Curculionidae) emergence in a range of soil

types. Environ. Entomol. 27, 549–554.

Sequeira, A.S., Farrell, B.D., 2001. Evolutionary origins of Gondwa-

nan interactions: How old are Araucaria beetle herbivores? Biol. J.

Linnean Soc. 74, 459–474.

Sequeira, A.S., Normark, B.B., Farrell, B.D., 2000. Evolutionary

assembly of the conifer fauna: distinguishing ancient from recent

associations in bark beetles. Proc. R. Soc. Lond. Ser. B-Biol. Sci.

267, 2359–2366.

Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., Flook, P.,

1994. Evolution, weighting, and phylogenetic utility of mitochon-

drial gene-sequences and a compilation of conserved polymerase

chain-reaction primers. Ann. Entomol. Soc. Am. 87, 651–701.

Sorenson, M.D., 1999. TreeRot. Boston University, Boston, MA.

Stone, G.N., Cook, J.M., 1998. The structure of cynipid oak galls:

patterns in the evolution of an extended phenotype. Proc. R. Soc.

Lond. Ser. B-Biol. Sci. 265, 979–988.

Swofford, D.L., 1999. PAUP*. Phylogenetic Analysis Using Parsi-

mony (* and Other Methods). Sinauer Associates, Sunderland,

MA.

Tamura, K., Nei, M., 1993. Estimation of the number of nucleotide

substitutions in the control region of mitochondrial DNA in

humans and chimpanzees. Mol. Biol. Evol. 10, 512–526.

Templeton, A.R., 1983. Phylogenetic inference from restriction endo-

nuclease cleavage site maps with particular reference to the

evolution of humans and the apes. Evolution 37, 221–244.

Theodorides, K., De Riva, A., Gomez-Zurita, J., Foster, P.G., Vogler,

A.P., 2002. Comparison of EST libraries from seven beetle species:

towards a framework for phylogenomics of the Coleoptera. Insect

Mol. Biol. 11, 467–475.

Thorley, J.L., Page, R.D.M., 2000. RadCon: phylogenetic tree

comparison and consensus. Bioinformatics 16, 486–487.

Vidal, N., Lecointre, G., 1998. Weighting and congruence: a case study

based on three mitochondrial genes in pitvipers. Mol. Phylogenet.

Evol. 9, 366–374.

Vogler, A.P., Knisley, C.B., Glueck, S.B., Hill, J.M., Desalle, R., 1993.

Using molecular and ecological data to diagnose endangered

populations of the Puritan Tiger Beetle Cicindela puritana. Mol.

Ecol. 2, 375–383.

Von Dalla Torre, K.W., Schlenkling, S., 1932. Curculionidae: Subfam.

Curculioninae. In: Junk, W. (Ed.), Coleopterorum Catalogus. Junk,

Berlin.

Wenzel, J.W., Siddall, M.E., 1999. Noise. Cladistics 15, 51–64.

Wheeler, W.C., Cartwright, P., Hayashi, C.Y., 1993. Arthropod

phylogeny—a combined approach. Cladistics 9, 1–39.

Wiens, J.J., 1998. Does adding characters with missing data increase or

decrease phylogenetic accuracy? Syst. Biol. 47, 625–640.

Wiens, J.J., Reeder, T.W., 1995. Combining data sets with different

numbers of taxa for phylogenetic analysis. Syst. Biol. 44, 548–558.

Wilkinson, M., 1995. Coping with abundant missing entries in

phylogenetic inference using parsimony. Syst. Biol. 44, 501–514.

Yang, Z.H., 1998. On the best evolutionary rate for phylogenetic

analysis. Syst. Biol. 47, 125–133.