THE GENETIC, MOLECULAR, AND EVOLUTIONARY
DISSECTION OF THE TEOSINTE BRANCHED1 GENE
By
Anthony J. Studer
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
(Genetics)
at the
UNIVERSITY OF WISCONSIN-MADISON
2011
Table of Contents
Acknowledgements
Abstract
List of Tables and Figures
Preface
Chapter 1
A transposon insertion in the maize gene tb1 functions as an
enhancer and was a causative variant for change in plant architecture
during maize domestication.
Chapter 2
Do large effect QTLs fractionate?
A case study at the maize domestication QTL teosinte branched1.
Chapter 3
Evidence for a natural allelic series at the maize domestication
gene teosinte branched1.
Acknowledgements
I would first like to express my most sincere gratitude to John Doebley for his mentorship. He
has provided me with guidance, support, knowledge, and opportunity without which I would not
be the scientist that I am today. His poise and thoughtfulness will serve as a standard that I
hope to achieve. He has earned my deepest respect as a scientist and individual. I would also like
to thank my family, above all my supportive wife, Erin. She is the one thing in my life that is
always right, even when everything else seems to be going wrong. I also owe great thanks to my
parents, Mary Ann and Terry, whose love and support from a young age have helped shape who I
am today. My appreciation extends to my undergraduate mentor, Dr. Bernard Mikula, who
helped nurture my young scientific enquiries at the very beginning of my career, and still weekly
adds scientific perspective to my sometimes narrow thinking. I also would like to thank the
members of the Doebley lab, especially Bao, Huai, Jesse and Tina. I feel blessed to have found
such close friends. It was a true joy to work and learn alongside them. Last, but certainly not
least, I thank all of my friends, especially Pete. He is like a brother to me, and I am glad that I
had him with me on this journey.
Special thanks also go to my committee, Rick Amasino, Shawn Kaeppler, Patrick
Masson, and Rick Vierstra for their thoughtful advice. I would also like to thank Jeff Ross-Ibarra
and Qiong Zhao for their population genetics contributions to Chapter 1.
Abstract
Maize was domesticated approximately 9,000 years ago from its wild progenitor, teosinte, and
serves as a model for evolution. This research focused on the domestication gene teosinte
branched1 (tb1), which controls morphological differences between maize and teosinte. The
following chapters document the genetic changes which occurred due to selection applied by
ancient farmers during domestication. The results presented here have not only advanced the
study of maize domestication, but also have a broader scientific impact and provide examples of
genetic changes that drive evolution. The causative differences in the upstream regulatory region
of tb1 between maize and teosinte suggest that transposon insertions provide a source of
variation on the molecular level. Insertions such as these produce phenotypic variation that can
be selected on during evolution. Our results from studying the tb1 regulatory differences between
maize and teosinte suggest the genetic architecture underlying natural phenotypic variation can
be more complex than coding region mutations. tb1 provides an example of complex gene
regulation, including long distance cis-regulatory regions and multiple linked QTL with epistatic
interactions. Finally, the study of the natural allelic series at tb1 suggests tb1 was not only
involved in maize domestication, but also the morphological diversification of teosinte taxa.
Taken together these results help us understand sources of genetic and phenotypic variation, and
the genetic architecture that drives evolution.
List of Tables and Figures
Chapter 1
Figure 1: Photographs of maize and teosinte plants.
Figure 2: The phenotypic additive effects for seven intervals across the tb1 genomic region.
Figure 3: Sequence diversity in maize and teosinte across the control region.
Figure 4: Control region constructs and corresponding normalized luciferase expression levels.
Figure S1: Map of introgression lines.
Figure S2: Phenotypic additive effects for the smallest introgression segments.
Table S1: Primer sequences for genotyping.
Table S2: Germplasm assayed for transposon insertions.
Table S3: Input values for HKA tests.
Chapter 2
Figure 1: Map of introgression lines.
Table 1: Experiment I results.
Figure 2: NIRIL frequency distributions for least-squares trait means.
Figure 3: Map of QTL detected in Experiment II.
Table 2: Experiment II results.
Table 3: Experiment II reanalysis.
Figure 4: Epistatic interactions between tb1 and additional linked QTL.
Figure 5: QTL model.
Table S1: RFLP markers used during backcrossing of T1L in Experiment II.
Chapter 3
Figure 1: Map of introgression lines.
Figure 2: Phenotypic means.
Figure 3: Additive effects.
Figure 4: Principal components plot.
Figure 5: Phylogenetic trees.
Table S1: Introgressed teosinte germplasm.
Table S2: Primer sequences for genotyping and sequencing.
Preface
The underlying genetic architecture of variation driving evolution has remained an unanswered
question in biology. Although many evolutionary biologists study variation in natural
populations, their work at the molecular level is limited due to the lack of genetic tools for non-
model systems. The domestication of crop plants has recently been recognized as a valuable
model for natural evolution because it represents an animal-plant interaction similar to the
selective pressures seen in the wild (Purugganan and Fuller 2009). In our research, we use the
domestication of maize (Zea mays ssp. mays) from its wild progenitor, teosinte (Z. mays ssp.
parviglumis), as a model for evolution. The well-developed techniques available in maize, and its
cross compatibility with teosinte, provide an excellent opportunity to study the genetics of
selection which leads to morphological changes (Doebley 2004).
Maize was domesticated approximately 9,000 years ago in the Balsas River Valley in
Mexico (Doebley 2004). During this relatively short time, maize diverged significantly with
respect to both plant and ear architecture. The morphological changes are so striking that it was
long debated whether teosinte was in fact the ancestor of maize (Bennetzen et al. 2001). For
example, teosinte has a highly branched architecture compared to maize, which shows much
greater apical dominance. A teosinte plant has multiple long branches off its main stalk. Each
branch is tipped by a tassel and produces many small ears at its nodes. In contrast, a modern
maize plant has only one or two short branches, each of which is tipped by a large grain-bearing
ear. The difference in size between a teosinte and maize ear is substantial. The small ears of
teosinte have only 10 or 12 grains, while a single ear of maize can have 300 or more kernels.
These differences in morphology are characteristic of domestication, which converts a plant
adapted to grow in the wild into a plant adapted to thrive in a cultivated environment (Doebley
et al. 2006). Furthermore, ancient farmers drove these phenotypic changes
without knowledge of the genetic factors controlling the traits.
teosinte branched1 (tb1) is one of the genes controlling the morphological differences
between maize and teosinte. tb1 was first identified as a large effect quantitative trait locus
(QTL) on the long arm of chromosome 1, and affects both plant and ear morphology (Doebley
and Stec 1991, 1993). The QTL at tb1 was later cloned and the maize allele was shown to be
expressed at twice the level of the teosinte allele in developing branches and in immature ears
(Doebley et al. 1997). Clark et al. (2006) fine-mapped the QTL causing the expression level difference
to a 12 kilobase (kb) control region located between 58.7 kb and 69.5 kb upstream of the tb1
open reading frame (ORF). Allele specific assays were used to show that this 12 kb
segment functions as a long distance cis-regulatory region. These previous studies provide
evidence that the morphological divergence of maize from teosinte is due to a change in the
regulation of tb1 rather than a change in the coding region. However, the causative difference in
the control region has yet to be identified.
The expression pattern and function of tb1 have also been elucidated. tb1 encodes a
basic-Helix-Loop-Helix transcription factor, which is a member of the TCP (TB1, CYC, PCFs)
family of transcriptional regulators (Cubas et al. 1999; Lukens and Doebley 2001). The
expression of tb1 in axillary buds represses organ growth and thereby reduces outgrowth of
axillary branches (Hubbard et al. 2002). Conserved expression patterns and repressor function
have been reported in many other plant species including Arabidopsis, bamboo, barley, rice,
sorghum, and wheat (Finlayson 2007; Peng et al. 2007; Ramsay et al. 2011; Takeda et al. 2003;
Kebrom et al. 2006; Lewis et al. 2008). Correlations between tb1 expression levels and
branching were also observed in these studies. The conservation of tb1 expression, function, and
sequence suggests its vital role in plant morphology (Lukens and Doebley 2001; Mondragon-
Palomino and Trontin 2011).
This work focuses on the differences between tb1 in maize and teosinte. By studying the
differences at tb1 selected on during domestication, it is possible to gain insight into the types of
changes at the molecular level that lead to whole plant phenotypes necessary for adaptive
evolution. Several different strategies were employed to study tb1 including molecular,
quantitative, and evolutionary genetic techniques. The research presented here has been
organized into three chapters.
1) The first chapter focuses on the upstream control region of tb1, which regulates the two
fold increase in expression seen in maize compared to teosinte. This work began with the
fine-mapping of the control region previously described. We used recombination
mapping to reduce introgressed segments of a teosinte long arm of chromosome 1 to only
a few hundred base-pairs in an otherwise all maize background. These efforts defined two
independent components of the control region. Sequences from a diverse sample of maize
and teosinte were used to assay nucleotide diversity and evidence for selection in the
control region. From this, four fixed differences were identified between the common
maize and teosinte haplotypes, two of which are transposon insertions in maize. Transient
assays in maize protoplasts were used to test these fixed differences for function. This
experiment identified a repressor sequence in the control region and shows that one of the
transposon insertions enhances gene expression. All results are consistent with observed
differences between maize and teosinte with respect to tb1 expression levels and whole
plant phenotypes. Molecular dating of the transposons places the time of insertion prior to
domestication, suggesting selection acted on standing genetic variation instead of new
mutation.
2) The second chapter addresses the possibility of the large effect QTL identified at tb1
fractionating into additional tightly linked QTL. Two experiments were used to ask: 1) is
the upstream control region of tb1 the only segment on the chromosome arm affecting
plant and ear morphology, and 2) where do these additional QTL map? We provide
evidence for a single QTL at tb1 affecting plant architecture traits. In addition to tb1, four
linked QTL were found which affect ear morphology. Two of these additional QTL act
epistatically with tb1. These results provide evidence that single QTL peaks do not
always correspond to single genes.
3) The final chapter investigates the allelic series at tb1 conferred by nine teosinte
introgressions into a maize isogenic background. This experiment goes beyond statistical
inference, and controls other possible sources of variation to provide strong evidence for
a natural allelic series at tb1. Unlike classic natural allelic series which control simple
phenotypes, the variation at tb1 is an example of a naturally occurring series for
morphological traits. The alleles separate into distinct phenotypic classes corresponding
to the taxonomic origin of the teosinte introgressions. Moreover, these classes also
correspond to known morphological differences between the teosinte taxa. These results
suggest tb1 was not only involved in maize domestication, but also the morphological
divergence between teosintes.
References
Bennetzen, J., E. Buckler, V. Chandler, J. Doebley, J. Dorweiler, et al., 2001 Genetic evidence
and the origin of maize. Latin Amer. Antiquity 12: 84-86.
Clark, R. M., T. Nussbaum Wagler, P. Quijada, and J. Doebley, 2006 A distant upstream
enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and
inflorescent architecture. Nat. Genet. 38: 594-597.
Cubas, P., N. Lauter, J. Doebley, and E. Coen, 1999 The TCP domain: a motif found in proteins
regulating plant growth and development. The Plant Journal 18: 215-222.
Doebley, J. and A. Stec, 1991 Genetic analysis of the morphological differences between maize
and teosinte. Genetics 129: 285-295.
Doebley, J. and A. Stec, 1993 Inheritance of the morphological differences between maize and
teosinte: comparison of results for two F2 populations. Genetics 134: 559-570.
Doebley, J., A. Stec, and L. Hubbard, 1997 The evolution of apical dominance in maize. Nature
386: 485-488.
Doebley, J., 2004 The genetics of maize evolution. Annu. Rev. Genet. 38: 37-59.
Doebley, J. F., B. S. Gaut, and B. D. Smith, 2006 The molecular genetics of crop domestication.
Cell 127: 1309-1321.
Finlayson, S. A., 2007 Arabidopsis TEOSINTE BRANCHED1-LIKE 1 regulates axillary bud
outgrowth and is homologous to monocot TEOSINTE BRANCHED1. Plant Cell
Physiol. 48: 667-677.
Hubbard, L., P. McSteen, J. Doebley, and S. Hake, 2002 The expression patterns and mutant
phenotype of teosinte branched1 correlate with growth suppression in maize and teosinte.
Genetics 162: 1927-1935.
Kebrom, T. H., B. L. Burson, and S. A. Finlayson, 2006 Phytochrome B represses teosinte
branched1 expression and induces sorghum axillary bud outgrowth in response to light
signals. Plant Physiol. 140: 1109-1117.
Lewis, J. M., C. A. Mackintosh, S. Shin, E. Gilding, S. Kravchenko, et al., 2008 Overexpression
of the maize Teosinte Branched1 gene in wheat suppresses tiller development. Plant Cell
Rep. 27: 1217-1225.
Lukens, L. N. and J. Doebley, 2001 Molecular evolution of the teosinte branched gene among
maize and related grasses. Mol. Biol. Evol. 18: 627-638.
Mondragon-Palomino, M. and C. Trontin, 2011 High time for a roll call: gene duplication and
phylogenetic relationships of TCP-like genes in monocots. Ann. Bot.
doi: 10.1093/aob/mcr059.
Peng, H.-Z., E.-P. Lin, Q.-L. Sang, S. Yao, Q.-Y. Jin, et al., 2007 Molecular cloning,
expression analysis and primary evolution studies of REV- and TB1-like genes in
bamboo. Tree Physiol. 27: 1273-1281.
Purugganan, M. D., and D. Q. Fuller, 2009 The nature of selection during plant domestication.
Nature 457: 843-848.
Ramsay, L., J. Comadran, A. Druka, D. F. Marshall, W. T. B. Thomas, et al., 2011
INTERMEDIUM-C, a modifier of lateral spikelet fertility in barley, is an ortholog of the
maize domestication gene TEOSINTE BRANCHED1. Nat. Genet. 43: 169-173.
Chapter 1
A transposon insertion in the maize gene tb1 functions as an
enhancer and was a causative variant for change in plant
architecture during maize domestication
Abstract
Maize and its progenitor, teosinte, show a striking difference in plant architecture that is partially
governed by a regulatory element or “control region” in the tb1 gene. We show that the tb1
control region is complex, having two components with independent effects on the plant
phenotypes that distinguish maize and teosinte. The common maize haplotype for the control
region possesses two transposable element insertions (Hopscotch and Tourist) that are not found
in the common teosinte haplotypes. Using transient expression assays, we show that the
Hopscotch insertion acts as an enhancer of gene expression, consistent with the higher level of
tb1 expression seen in maize. Molecular dating indicates that the Hopscotch and Tourist
insertions predate maize domestication by at least 10,000 years, indicating that selection acted on
standing variation rather than new mutation. Our results highlight how transposons can
contribute to evolution and domestication through alterations in gene regulation.
Introduction
Multiple lines of experimentation suggest that genetic diversity created by transposable elements
(TEs) is an important source of functional variation upon which selection acts during evolution
(Naito et al. 2009; Xiao et al. 2008; White et al. 1994; Bejerano et al. 2006; Mackay et al. 1992;
Torkamanzehi et al. 1992). TEs have been associated with adaptation to temperate climates in
Drosophila (Gonzalez et al. 2010), a SINE element insertion has been associated with the
domestication of small dog breeds from the grey wolf (Gray et al. 2010), and there is even
evidence that TEs were targets of selection during human evolution (Britten 2010). While
examples of TEs associated with host gene function continue to grow, formal proof that TEs are
causative and not just correlated with functional variation is limited. Here, we show that a TE
(Hopscotch) insertion in a regulatory region of the maize domestication gene tb1 acts as an
enhancer of gene expression and partially explains the increased apical dominance in maize as
compared to its wild progenitor, teosinte. Molecular dating indicates that the Hopscotch insertion
predates maize domestication by at least 10,000 years, indicating that selection acted on standing
variation rather than new mutation.
Materials and Methods
Plant materials
Nine introgression lines generated by Clark and colleagues (2006) were used for fine-mapping
the control region (Figure S1). Nine additional introgression lines were recovered using the same
strategy as used by Clark and colleagues (2006). This entailed backcrossing homozygous
introgression lines to W22, and screening individual F2 progeny for cross-overs in the
introgressed chromosomal segment. Genotyping was accomplished using a set of eight PCR-
based indel markers (GS1-8, Table S1), tagged with a 5' HEX or FAM label and then assayed
using an ABI 3700 fragment analyzer.
Phenotypic and genotypic data collection
Plants were grown at the University of Wisconsin West Madison Agricultural Research Station,
Madison, WI, USA during summers of 2006-2009. F2 seed derived from the cross of each of the
homozygous introgression lines to W22 were planted in a completely randomized design using
grids with 0.9 meter spacing between plants in both dimensions. This spacing minimized the
degree to which plants shaded their neighbors. The following three traits were phenotyped:
cupules per rank (CUPR; number of cupules in a single rank from base to the tip of the ear),
lateral branch length (LBLH; length, in cm, of uppermost lateral branch), and tillering (TILL; the
ratio of the sum of tiller heights/plant height). All plants were genotyped individually using a
combination of the eight PCR based markers described above.
Phenotypic data analysis
The tb1 genomic region was divided into 16 segments based on the recombination breakpoints of
the 18 introgression lines. To examine the near-colinearity of these segments with one another,
the CORR procedure of SAS was used to calculate the correlation coefficient between segments.
If the correlation between two segments is high then the model will not adequately fit the two
segments simultaneously. Thus not all segments can be used in a single model. Two segments
were included as separate factors in the model only if their correlation coefficient was less than
0.8. Our final analysis included seven segments which represent the entire tb1 genomic
region. The numbers of plants included in each trait model are as follows: 5491 (TILL), 4591
(LBLH), and 3499 (CUPR). The MIXED procedure of SAS was used to test each segment for an
effect on phenotype. Segments (1-7) were considered fixed effects while year (2006-2009), the
introgression line by ear interaction term, and the introgression line by segment interaction terms
were treated as random effects. The linear model used was
$$Y_{hijkl} = \mu + a_h + b_i + c_j * d_k + c_j * a_h + e_{hijkl}$$
where $Y_{hijkl}$ is the trait value for the $l$th plant with $h$th segments from the $k$th ear of the $j$th
introgression line in the $i$th year, $\mu$ is the overall mean of the experiment, $a_h$ is the segment effect,
$b_i$ is the year effect, $c_j * d_k$ is the introgression line by ear interaction, $c_j * a_h$ is the introgression line
by segment interaction, and $e_{hijkl}$ is the sampling error. The random effects of this full model were
subjected to the Likelihood Ratio Test for significance for each trait. Effects that were not
significant were dropped from the model on a trait by trait basis.
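The collinearity screen described above can be illustrated with a minimal Python sketch. The analysis itself used the CORR procedure of SAS. The function names, the toy genotype coding (1 for a teosinte segment, 0 for a maize segment), and the use of the absolute value of the coefficient are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(segments, threshold=0.8):
    """Return segment pairs too correlated to be fit as separate factors.

    segments maps a (hypothetical) segment name to its genotype scores
    across introgression lines. Taking the absolute value is an assumption.
    """
    return [
        (a, b)
        for a, b in combinations(segments, 2)
        if abs(pearson(segments[a], segments[b])) >= threshold
    ]


# Toy genotypes for six lines (illustrative values, not data from this study).
toy = {
    "seg1": [1, 1, 0, 0, 1, 0],
    "seg2": [1, 1, 0, 0, 1, 0],
    "seg3": [0, 1, 1, 0, 1, 0],
    "seg4": [0, 0, 1, 1, 1, 0],
}
flagged = collinear_pairs(toy)
print(flagged)
```

In this toy case only seg1 and seg2 are flagged, so they could not both enter the model as separate factors.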
Nucleotide diversity
A sample of 16 maize landraces made haploid for DNA extraction (Tenaillon et al. 2001) and 17
inbred teosinte lines were used to assay nucleotide diversity in the control region (Table S2).
Sequencing of PCR fragments for the 33 individuals was done using standard PCR conditions
and Applied Biosystems BigDye kit at the University of Wisconsin Biotechnology Center using
Sanger sequencing methods. Initial alignment of nucleotide sequences was performed using
ClustalW (Thompson et al. 1994) and then finished by hand. Nucleotide diversity (π) was
calculated using a 500 base-pair sliding window with a 25 base-pair step with a correction for
small sample size. Nucleotide sites in the alignment were only used for calculating π if at least
ten individuals had ungapped/unambiguous calls from the maize and from the teosinte groups.
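A sliding-window calculation of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the analysis: per-site diversity is taken as expected heterozygosity scaled by n/(n-1) as the small-sample correction, a window value is the mean over usable sites, and the call threshold is applied to a single group of sequences rather than to maize and teosinte jointly. All function and parameter names are hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")  # gaps and ambiguity codes are treated as missing


def site_pi(column, min_calls=10):
    """Per-site diversity with an n/(n-1) correction, or None if too few calls."""
    calls = [base for base in column if base in VALID_BASES]
    n = len(calls)
    if n < min_calls or n < 2:
        return None
    counts = Counter(calls)
    heterozygosity = 1.0 - sum((c / n) ** 2 for c in counts.values())
    return heterozygosity * n / (n - 1)


def sliding_pi(alignment, window=500, step=25, min_calls=10):
    """Return (window_start, mean per-site pi) for each window with usable sites."""
    length = len(alignment[0])
    per_site = [
        site_pi([seq[i] for seq in alignment], min_calls) for i in range(length)
    ]
    results = []
    for start in range(0, length - window + 1, step):
        usable = [p for p in per_site[start:start + window] if p is not None]
        if usable:
            results.append((start, sum(usable) / len(usable)))
    return results


# Tiny illustrative alignment (uppercase bases; not data from this study).
toy_alignment = ["AAAA", "AAAT", "ACA-", "ACAN"]
windows = sliding_pi(toy_alignment, window=2, step=1, min_calls=2)
print(windows)
```

With the defaults (`window=500`, `step=25`, `min_calls=10`) the function mirrors the window size, step, and call threshold given in the text.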
Tests for neutrality
The HKA tests (Hudson et al. 1987) for neutrality were performed using DnaSP (Rozas et al.
2003). Zea diploperennis was used as an outgroup, and its sequence was aligned with that of the
33 individuals used in the nucleotide diversity survey. A set of six previously described (Zhao et
al. 2010) neutral loci were used as control genes (Table S3). For each HKA test, an overall χ²
value was calculated by taking the sum of the individual χ² values calculated for the six
individual neutral loci. These overall χ² values were then used to obtain overall P-values.
Minimum spanning tree
The minimum spanning tree was constructed using the same 33 individuals as used in the
nucleotide diversity survey (Table S2). The alignment of the sequences was trimmed of gaps and
missing data and then imported into Arlequin version 3.5 (Excoffier and Lischer 2010), which
was used to define the haplotypes and calculate the minimum spanning tree among haplotypes.
Arlequin's distance matrix output was used in Hapstar (Teacher and Griffiths 2010) to draw the
minimum spanning tree.
Insertion frequencies
The frequencies of the Tourist and Hopscotch insertions were calculated using a diverse set of 139
maize chromosomes and 148 teosinte chromosomes (Table S2). The frequency of each insertion
was assayed using a three primer PCR reaction (Table S1) which allowed both homozygous and
heterozygous individuals to be scored on a 2% agarose gel using standard PCR conditions.
Insertion dating
Initial alignment of nucleotide sequences of the Tourist and Hopscotch elements was performed
using ClustalW (Thompson et al. 1994) and then finished by hand. Diversity analyses were
performed using the “compute” function of the analysis package of libsequence (Thornton 2003).
Of the 16 maize alleles sequenced, 15 have the Hopscotch insertion. These 15 alleles had 2
insertion/deletions and 16 segregating sites, of which 13 were singleton mutations and three were
found in two sequences. If we assume a star phylogeny, we can estimate the time since insertion
of the Hopscotch as $T = S(15\mu L)^{-1}$, where T is time in generations, S is the number of segregating
sites, L is the length of the sequence in bp and µ is the per generation mutation rate per bp. Given
1524 bp of sequence, 16 segregating sites, a generation time of one year, and a mutation rate of
$3 \times 10^{-8}$ (Clark et al. 2005), this gives an estimate of approximately 23,300 years. While these
assumptions (star phylogeny, mutation rate, ignoring doubletons) are clearly unrealistic,
changing any of them leads to an increase in the estimated time of insertion.
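The arithmetic behind this estimate can be checked with a short Python sketch. The function and parameter names are hypothetical; the values are those given in the text, and the calculation simply evaluates the star-phylogeny formula above.

```python
def insertion_age_years(segregating_sites, n_alleles, length_bp,
                        mutation_rate, generation_time_years=1.0):
    """Evaluate T = S / (n * mu * L) and convert generations to years."""
    generations = segregating_sites / (n_alleles * mutation_rate * length_bp)
    return generations * generation_time_years


# Values from the text: 16 segregating sites among 15 Hopscotch-bearing alleles,
# 1524 bp of sequence, and a mutation rate of 3e-8 per bp per generation.
age = insertion_age_years(16, 15, 1524, 3e-8)
print(f"Estimated time since insertion: {age:.0f} years")
```

The result is consistent with the approximately 23,300 years reported above.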
Protoplast transient assays
Two reporter constructs were developed for the transient assays. A reporter construct containing
the Cauliflower Mosaic Virus (CaMV) 35S minimal promoter (Benfey and Chua 1990) driving
expression of the firefly luciferase gene was used for testing control region segments. The
second reporter containing the rice actin1 promoter driving expression of the Renilla luciferase
gene was used as an internal transformation control. Transient expression assays using maize
mesophyll protoplasts were performed following a detailed protocol from the Sheen lab (Sheen
2002) with transformation conditions modified as follows. Briefly, $2\text{-}4 \times 10^{5}$ freshly isolated
protoplasts in 400 µl electroporation buffer were mixed with 50 µl of plasmids. The
protoplast-plasmid mixes were transferred into 0.5 ml cuvettes and electroporated with Gene Pulser II
Electroporation System (Bio-Rad) set at 250 volts. Each sample received 3 pulses of 1.5 msec
each with a 20 sec pause between the pulses. After electroporation, protoplasts were incubated
for 18 hrs at 25 °C and then harvested. The harvested protoplasts were lysed with CCLR (Cell
Culture Lysis Reagent, Promega) and assayed using a Dual-Luciferase Reporter Assay System
(Promega) following the manufacturer's instructions. Four to six biological replicates, each with two
technical replicates, were assayed per construct.
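One way to normalize dual-luciferase readings of this kind is sketched below in Python. The averaging scheme is an assumption and may differ from the one used for the reported values: each firefly reading is divided by the matching Renilla reading, technical replicates are averaged within each biological replicate, and the biological replicates are then averaged. The function names and the readings are hypothetical.

```python
from statistics import mean


def normalized_expression(biological_replicates):
    """Mean firefly/Renilla ratio across biological replicates.

    biological_replicates: list of biological replicates, each a list of
    (firefly, renilla) readings from its technical replicates.
    """
    per_replicate = [
        mean(firefly / renilla for firefly, renilla in technical)
        for technical in biological_replicates
    ]
    return mean(per_replicate)


def relative_expression(construct, control):
    """Expression of a construct relative to a control construct."""
    return normalized_expression(construct) / normalized_expression(control)


# Hypothetical luminometer readings (arbitrary units), two biological
# replicates with two technical replicates each.
construct = [[(200.0, 100.0), (220.0, 100.0)], [(180.0, 90.0), (210.0, 100.0)]]
control = [[(100.0, 100.0), (110.0, 100.0)], [(90.0, 90.0), (105.0, 100.0)]]
print(normalized_expression(construct), relative_expression(construct, control))
```

Dividing by the Renilla signal corrects for differences in transformation efficiency between samples, which is the role of the internal control described above.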
Results and Discussion
During its domestication, maize underwent a dramatic transformation in both plant and
inflorescence architecture as compared to its wild progenitor, teosinte (Doebley 2004). Like
many wild grasses, teosinte has a highly branched architecture (Figure 1). The main stalk of a
teosinte plant has multiple long branches, each tipped by a tassel and bearing many small ears of
grain at its nodes. By comparison, the stalk of a modern maize plant has only one or two short
branches, each tipped by a large grain-bearing ear. The difference in size between the
teosinte and maize ears is substantial. The small ears of teosinte have only 10 or 12 grains, while a
single ear of maize can have 300 or more kernels. Overall, maize shows much greater apical
dominance with the development of the branches repressed relative to the development of the
main stalk.
Figure 1: Teosinte and maize plants. a, Highly branched teosinte plant and b, a teosinte lateral
branch with terminal tassel. c, Unbranched maize plant and d, maize ear shoot (i.e. lateral
branch).
The teosinte branched1 (tb1) gene is a major contributor to the increase in apical
dominance during maize domestication. tb1 was first identified as a quantitative trait locus
(QTL) (Doebley et al. 1995), and subsequently shown to encode a member of the TCP family of
transcriptional regulators (Cubas et al. 1999). tb1 acts as a repressor of organ growth, and
thereby contributes to apical dominance by repressing the outgrowth of branches. Prior research
has shown that the maize allele of tb1 is expressed more highly than the teosinte allele, thereby
conditioning greater repression of branching (Doebley et al. 1997). The regulatory element or
“control region” modulating this difference in expression is located between 58.7 kb and 69.5 kb
upstream of the tb1 open reading frame (ORF) (Clark et al. 2006). Although the region
containing the causative factor distinguishing maize and teosinte was narrowed to this ~11 kb
interval, the nature of this factor, whether simple or multipartite, and the identity of the exact
causative polymorphism(s) have not been elucidated.
We used genetic fine-mapping to locate the factors influencing phenotype in the control
region. We isolated 18 maize-teosinte recombinant chromosomes, each containing a unique
teosinte portion of the tb1 genomic region, and we made these 18 recombinant chromosomes
isogenic in a common maize inbred background (Figure S1). This collection of recombinant
chromosomes enabled us to divide the tb1 genomic region into seven segments based on
recombination breakpoints. The isogenic lines for these recombinant chromosomes were
evaluated over four growing seasons and the phenotypes of more than 5500 plants recorded. The
resulting data were analyzed using a mixed linear statistical model, enabling us to test each
segment for an effect on phenotype. This analysis confirmed that the control region previously
described by Clark and colleagues (2006) is responsible for differences in both plant and ear
architecture between maize and teosinte (Figure 2). Moreover, our data show that the control
region is complex, having two independent components affecting phenotype. These two
components, which we call the proximal and distal components, are separated by recombination
breakpoints located ~63.9 kb upstream of the tb1 ORF. The independent phenotypic effects of
the proximal and distal components are readily seen in lines that segregate for only one or the
other of these components (Figure S2).
Previous analyses indicated that the tb1 genomic region shows evidence for a selective
sweep during domestication that extends from the ORF to -58.6 kb but ends before -93.4 kb
(Clark et al. 2004). To better define the extent of the sweep, we performed population genetic
analyses for the region between -57.4 and -67.6 kb using a diverse set of maize and teosinte
lines. Nucleotide diversity (π) at -58 kb is high in teosinte but low in maize (Figure 3a). Between
-58 and -65 kb, nucleotide diversity is low in both maize and teosinte, but lower in maize. The
low diversity for both maize and teosinte in this region suggests that this region is evolving under
functional constraint. Beyond 65 kb upstream of the ORF, diversity rises in both maize and
teosinte. The rise in nucleotide diversity in maize beyond -65 kb suggests that the selective
sweep ends near this point.
We applied the HKA test (Hudson et al. 1987) to address whether individual segments of
the control region show evidence of past selection. Our results confirm previous findings (Zhao
2006) that the region from -65.6 kb to -67.6 kb (segments A and B in Figure 3) does not depart
significantly from neutral expectations, but that the neutral model can be rejected for the region
from -58.8 to -57.4 (segment D). We also tested, for the first time, an additional segment
19
Figure 2: The phenotypic additive effects for seven intervals across the tb1 genomic region. The
horizontal axis represents the tb1 genomic region to scale. Base-pair positions are relative to
AGPv2 position 256,745,977 of the maize reference genome sequence. The tb1 ORF as well as
the nearest upstream predicted gene (PG3) are shown. The previously defined control region
(CR) (Clark et al. 2006) is shown in red, and is divided into its proximal and distal components.
Vertical columns represent the additive effects shown with standard error bars for each of the
three traits in each of the seven intervals that were tested for an effect on phenotype. Black
columns are statistically significant [P(Bonferroni)<0.05]; white bars are not statistically
significant [P(Bonferroni)>0.05].
Figure 3: Sequence diversity in maize and teosinte across the control region. a, Nucleotide
diversity across the tb1 upstream control region. Base-pair positions are relative to AGPv2
position 256,745,977 of the maize reference genome sequence. P-values correspond to HKA
neutrality tests for regions A through D as defined by the dotted lines. Green shading signifies
evidence for neutrality, and red shading signifies regions of non-neutral evolution. Nucleotide
diversity (π) for maize (yellow line) and teosinte (green line) was calculated using a 500 base
pair sliding window with a 25 base pair step. The distal and proximal components of the control
region with four fixed sequence differences between the most common maize haplotype and
teosinte haplotypes are shown below. b, A minimum spanning tree for the control region with 16
diverse maize and 17 diverse teosinte sequences. Size of the circles for each haplotype group
(yellow, maize; green, teosinte) is proportional to the number of individuals with that haplotype.
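[Figure 3 graphic. Panel a: nucleotide diversity (vertical axis, 0 to 0.06) for maize (M) and teosinte (T) from -67 kb to -58 kb, with HKA neutrality tests for segments A (P = 0.95), B (P = 0.41), C (P = 0.04), and D (P < 0.0001); the distal component with the Tourist insertion (408 bp) and the proximal component with the Hopscotch insertion (4,885 bp) are drawn below. Panel b: minimum spanning tree showing the maize cluster haplotype and the teosinte cluster haplotype.]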
(segment C, from -65.6 to -63.7 kb) in the middle of the control region, which our data show
significantly rejects the neutral model. Prior results (Clark et al. 2004) demonstrated that the
sweep extends from -58 kb to the tb1 ORF; thus overall, the sweep includes approximately 65.6
kb from the control region to the ORF.
Phenotypic fine-mapping with the recombinant chromosomes indicated that the factors
controlling phenotype lie between 58.7 kb and 69.5 kb upstream of the ORF. Population genetic
analysis indicates that the selective sweep extends only to -65.6 kb. Together, these two sources
of information suggest that the causative polymorphisms lie between -58.7 and -65.6 kb of the
ORF. We looked in greater detail at sequence diversity for maize and teosinte in the ~7 kb
segment that these two methods define. A minimum spanning tree for a sample of 16 diverse
maize and 17 diverse teosintes in this region revealed two distinct clusters of haplotypes – one
composed mostly of maize sequences and the other composed mostly of teosinte sequences
(Figure 3b). We designated these clusters as the maize cluster haplotype (MCH) and teosinte
cluster haplotype (TCH). There are four fixed differences between the sequences in the maize
and teosinte clusters (Figure 3a). Two of these fixed differences were single nucleotide
polymorphisms, and two were large insertions in the maize cluster haplotype relative to the teosinte
cluster haplotype. A BLAST search of the two insertion sequences revealed that one is a
Hopscotch retrotransposon and the other is a Tourist MITE (Miniature Inverted-repeat
Transposable Element). These TE insertions are located in the proximal (Hopscotch) and distal
(Tourist) components of the control region as delineated by phenotypic fine-mapping.
To estimate the frequency of the two haplotype groups in maize and teosinte, we assayed
139 additional diverse maize chromosomes and 148 additional diverse teosinte chromosomes
(Table S2). For this purpose, we used the Hopscotch and Tourist insertions as markers for the
haplotype groups. The MCH is present in >95% of the maize chromosomes assayed but <5% of
teosinte chromosomes. The fact that the MCH is not fixed in maize suggests either that the initial
selective sweep was not complete or that post-domestication gene flow from teosinte to maize
has reintroduced the TCH into the maize gene pool. Correspondingly, the presence of the MCH
in teosinte may represent either a haplotype variant that existed in teosinte prior to domestication
or post-domestication gene flow from maize into teosinte, which is known to occur (Fukunaga
et al. 2005).
Inspection of the sequence alignment of the Hopscotch-Tourist region suggests that the
two insertions differ in relative age. The Tourist has accumulated greater nucleotide diversity
(π = 0.0054) since insertion, including a pair of sites that fail the four-gamete test, indicative of
recombination among Tourist sequences. Nucleotide diversity in the Hopscotch insertion is much
lower (π = 0.0016) and shows no evidence of past recombination. These observations point to the
Hopscotch insertion being more recent than the Tourist. Our sequences do show evidence of
recombination between the Hopscotch and a SNP in the flanking sequence between the two
insertions, likely explaining how the Hopscotch insertion has come to be associated with
multiple alleles of the Tourist element.
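The four-gamete test mentioned above can be sketched in a few lines. This is an illustrative example only, with hypothetical function names; it assumes an alignment given as a list of equal-length strings (one per chromosome) and does not handle gaps or missing data. Two biallelic sites fail the test when all four two-site combinations are observed, which under an infinite-sites model implies at least one recombination event between them.

```python
from itertools import combinations

def fails_four_gamete_test(site_a, site_b):
    """site_a and site_b hold the alleles at two sites, one entry per
    chromosome, in the same chromosome order. Returns True when both
    sites are biallelic and all four two-site haplotypes are present."""
    gametes = set(zip(site_a, site_b))
    return len(set(site_a)) == 2 and len(set(site_b)) == 2 and len(gametes) == 4

def incompatible_site_pairs(alignment):
    """Return index pairs of biallelic columns that fail the test."""
    columns = list(zip(*alignment))
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]
    return [(i, j) for i, j in combinations(biallelic, 2)
            if fails_four_gamete_test(columns[i], columns[j])]
```

An empty result from `incompatible_site_pairs` corresponds to the "no evidence of past recombination" outcome reported for the Hopscotch sequences.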
These nucleotide diversity data allow us to ask whether the Hopscotch insertion arose
during or prior to domestication. We estimated the time to most recent common ancestor
(TMRCA) of the Hopscotch alleles assuming strong directional selection and a star phylogeny.
Using a relatively high mutation rate (3 × 10⁻⁸ /bp/yr) (Clark et al. 2005), this method yields an
estimate (~23,000 years) much older than the time since domestication (~9,000 BP) (Doebley
2004), suggesting the Hopscotch insertion (and thus the older Tourist as well) existed as standing
genetic variation in the teosinte ancestor of maize. Violation of the assumptions of strong
selection, a high mutation rate, and a star phylogeny would lead to an underestimate of the
TMRCA, so the estimate is in this sense conservative. Thus, we conclude that the Hopscotch insertion
predated the domestication process by more than 10,000 years and that the Tourist insertion is
even older.
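The order of magnitude of this estimate can be checked with a back-of-envelope sketch. The exact estimator used for the TMRCA is not spelled out in this section, so the relation below is an assumption: under a star phylogeny, two sequences are separated by two branches of length t, so the expected pairwise diversity is π = 2μt and hence t = π / (2μ). The function name is hypothetical, and the result is not expected to reproduce the ~23,000-year figure exactly.

```python
def star_phylogeny_tmrca(pi, mu):
    """Rough TMRCA in years under a star phylogeny.

    pi: nucleotide diversity per site among the sampled alleles
    mu: substitution rate per site per year
    Assumes pi = 2 * mu * t, i.e. each pair of sequences is separated
    by two branches of length t."""
    return pi / (2.0 * mu)

# Values quoted in the text for the Hopscotch insertion.
hopscotch_tmrca = star_phylogeny_tmrca(pi=0.0016, mu=3e-8)
```

With the values quoted in the text, this simplified relation gives a figure in the mid-20,000s of years, the same order as the estimate reported above and well beyond the ~9,000 years since domestication.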
Having identified only four fixed differences in the proximal and distal components of
the control region, we used transient assays in maize leaf protoplasts to test all four differences
for effects on gene expression. Maize and teosinte chromosomal segments for the proximal and
distal components of the control region were cloned into reporter constructs 5' of the minimal
promoter of the cauliflower mosaic virus (mpCaMV), the firefly luciferase ORF, and the
nopaline synthase (NOS) terminator (Figure 4). Each construct was assayed for luminescence
after transformation by electroporation into maize protoplasts. The constructs for the distal
component contrast the effects of the Tourist insertion plus the single fixed nucleotide
substitution that distinguish maize and teosinte. Both the maize and teosinte constructs for the
distal component repressed luciferase expression relative to the minimal promoter alone. The
maize construct with the Tourist excised gives luciferase expression equivalent to the native
maize and teosinte constructs and less expression than the minimal promoter alone. These results
indicate that this segment is functionally important, acting as a repressor of luciferase expression
and by inference of tb1 expression in vivo. However, contrary to expectation, we did not observe
any difference between the maize and teosinte constructs. One possible cause for the failure to see an
expression difference between the maize and teosinte constructs would be that additional
proteins required to reveal the difference are not present in maize leaf protoplasts. Nevertheless,
the results do indicate that the distal component has a functional element, acting as a repressor.
Figure 4: Constructs and corresponding normalized luciferase expression levels. Transient
assays were performed in maize leaf protoplasts. Each construct is drawn to scale. The construct
backbone consists of the minimal promoter from the cauliflower mosaic virus (mpCaMV, grey
box), luciferase ORF (luc, white box), and the nopaline synthase terminator (black box).
Proximal and distal components of the control region (hatched boxes) from maize and teosinte
were cloned into restriction sites 5' of the minimal promoter. “Δ” denotes the excision of either
the Tourist or Hopscotch element from the maize construct. Horizontal green bars show the
normalized mean with standard error bars of each construct.
The functional importance of this segment is supported by its low level of nucleotide diversity
(Figure 3a), suggesting a history of purifying selection.
The constructs for the proximal component of the control region contrast the effects of
the Hopscotch insertion plus a single fixed nucleotide substitution that distinguish maize and
teosinte. The construct with maize sequence including the Hopscotch increased expression of the
luciferase reporter two-fold relative to the teosinte construct for the proximal control region and
the minimal promoter alone (Figure 4). Luciferase expression was returned to the level of the
teosinte construct and the minimal promoter construct by deleting the Hopscotch element from
the full maize construct. These results indicate that the Hopscotch element enhances luciferase
expression and by inference tb1 expression in vivo. They also indicate that the Hopscotch rather
than the fixed SNP difference between maize and teosinte is the causative polymorphism. The
observed enhancement of gene expression by the Hopscotch element is consistent with the
known higher level of tb1 expression in maize as compared to teosinte.
Our observation of a TE providing an enhancer element in tb1 is similar to that observed
with globin genes in primates in which an ERV-9 element has been shown to function as a long
distance enhancer of gene expression (Pi et al. 2010). Similarly, in Drosophila, the LTR of an
Accord element acts as an enhancer of Cyp6g1, which metabolizes DDT, thereby conferring
pesticide resistance (Chung et al. 2007; Schmidt et al. 2010). Over 25 years ago, McClintock
proposed that transposable elements represent a key source of variation for evolution (1984).
Remarkably, a transposable element insertion appears to represent the causal variant for one of
the key steps in the domestication of maize, the organism in which McClintock discovered
transposable elements.
References
Bejerano, G., C. B. Lowe, N. Ahituv, B. King, A. Siepel, et al., 2006 A distal enhancer and an
ultraconserved exon are derived from a novel retrotransposon. Nature 441: 87-90.
Benfey, P. N., and N. Chua, 1990 The cauliflower mosaic virus 35S promoter: combinatorial
regulation of transcription in plants. Science 250: 959-966.
Britten, R. J., 2010 Transposable element insertions have strongly affected human evolution.
Proc. Natl. Acad. Sci. U.S.A. 107: 19945-19948.
Chung, H., M. R. Bogwitz, C. McCart, A. Andrianopoulos, R. H. ffrench-Constant, et al., 2007
Cis-regulatory elements in the Accord retrotransposon result in tissue-specific expression
of the Drosophila melanogaster insecticide resistance gene Cyp6g1. Genetics 175: 1071-
1077.
Clark, R. M., E. Linton, J. Messing, and J. F. Doebley, 2004 Pattern of diversity in the genomic
region near the maize domestication gene tb1. Proc. Natl. Acad. Sci. U.S.A. 101: 700-
707.
Clark, R. M., S. Tavare, and J. Doebley, 2005 Estimating a nucleotide substitution rate for maize
from polymorphism at a major domestication locus. Mol. Biol. Evol. 22: 2304-2312.
Clark, R. M., T. Nussbaum Wagler, P. Quijada, and J. Doebley, 2006 A distant upstream
enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and
inflorescent architecture. Nat. Genet. 38: 594-597.
Cubas, P., N. Lauter, J. Doebley, and E. Coen, 1999 The TCP domain: a motif found in proteins
regulating plant growth and development. The Plant Journal 18: 215-222.
Doebley, J., A. Stec, and C. Gustus, 1995 teosinte branched1 and the origin of maize: Evidence
for epistasis and the evolution of dominance. Genetics 141: 333-346.
Doebley, J., A. Stec, and L. Hubbard, 1997 The evolution of apical dominance in maize. Nature
386: 485-488.
Doebley, J., 2004 The genetics of maize evolution. Annu. Rev. Genet. 38: 37-59.
Excoffier, L., and H. E. L. Lischer, 2010 Arlequin suite ver 3.5: a new series of programs to
perform population genetics analyses under Linux and Windows. Mol. Ecol. Resources
10: 564-567.
Fukunaga, K., J. Hill, Y. Vigouroux, Y. Matsuoka, J. Sanchez, et al., 2005 Genetic diversity and
population structure of teosinte. Genetics 169: 2241-2254.
Gonzalez, J., T. L. Karasov, P. W. Messer, and D. A. Petrov, 2010 Genome-wide patterns of
adaptation to temperate environments associated with transposable elements in
Drosophila. PLOS Genet. 6: e1000905.
Gray, M. M., N. B. Sutter, E. A. Ostrander, and R. K. Wayne, 2010 The IGF1 small dog
haplotype is derived from Middle Eastern grey wolves. BMC Biol. 8: 16.
Hudson, R. R., M. Kreitman, and M. Aguade, 1987 A test of neutral molecular evolution based
on nucleotide data. Genetics 116: 153-159.
Mackay, T. F. C., R. F. Lyman, and M. S. Jackson, 1992 Effects of P element insertions on
quantitative traits in Drosophila melanogaster. Genetics 130: 315-332.
McClintock, B., 1984 The significance of responses of the genome to challenge. Science 226:
792-801.
Naito, K., F. Zhang, T. Tsukiyama, H. Saito, N. C. Hancock, et al., 2009 Unexpected
consequences of a sudden and massive transposon amplification on rice gene expression.
Nature 461: 1130-1134.
Pi, W., X. Zhu, M. Wu, Y. Wang, S. Fulzele, et al., 2010 Long-range function of an
intergenic retrotransposon. Proc. Natl. Acad. Sci. U.S.A. 107: 12992-12997.
Rozas, J., J. C. Sanchez-DelBarrio, X. Messeguer, and R. Rozas, 2003 DnaSP, DNA
polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496-
2497.
Schmidt, J. M., R. T. Good, B. Appleton, J. Sherrard, G. C. Raymant, et al., 2010 Copy number
variation and transposable elements feature in recent, ongoing adaptation at the Cyp6g1
locus. PLOS Genet. 6: e1000998.
Sheen, J., 2002 “A transient expression assay using maize mesophyll protoplasts”
(http://genetics.mgh.harvard.edu/sheenweb).
Teacher, A. G. F. and D. J. Griffiths, 2010 HapStar: automated haplotype network layout and
visualization. Mol. Ecol. Resources 11: 151-153.
Tenaillon, M. I., M. C. Sawkins, A. D. Long, R. L. Gaut, J. F. Doebley, et al., 2001 Patterns
of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L).
Proc. Natl. Acad. Sci. U.S.A. 98: 9161-9166.
Thompson, J. D., D. G. Higgins, and T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity
of progressive multiple sequence alignment through sequence weighting, position-
specific gap penalties and weight matrix choice. Nucl. Acids Res. 22: 4673-4680.
Thornton, K., 2003 libsequence: a C++ class library for evolutionary genetic analysis.
Bioinformatics 19: 2325-2327.
Torkamanzehi, A., C. Moran, and F. W. Nicholas, 1992 P element transposition contributes
substantial new variation for a quantitative trait in Drosophila melanogaster. Genetics
131: 73-78.
White, S. E., L. F. Habera, and S. R. Wessler, 1994 Retrotransposons in the flanking regions of
normal plant genes: A role for copia-like elements in the evolution of gene structure and
expression. Proc. Natl. Acad. Sci. U.S.A. 91: 11792-11796.
Xiao, H., N. Jiang, E. Schaffner, E. J. Stockinger, and E. van der Knaap, 2008 A retrotransposon-
mediated gene duplication underlies morphological variation of tomato fruit. Science
319: 1527-1530.
Zhao, Q., 2006 Molecular population genetics of maize regulatory genes during maize evolution.
Ph.D. Thesis, University of Wisconsin-Madison, Madison, WI.
Zhao, Q., A. L. Weber, M. D. McMullen, K. Guill, and J. Doebley, 2010 MADS-box genes of
maize: frequent targets of selection during domestication. Genet. Res. Camb.
doi:10.1017/S0016672310000509.
Figure S1: tb1 locus teosinte recombinant chromosomes. The horizontal axis represents the tb1
genomic region to scale. Base-pair positions are relative to AGPv2 position 256,745,977 of the
maize reference genome sequence. The tb1 ORF as well as the nearest upstream predicted gene
(PG3) are shown. The previously defined control region (CR) is shown in red, and is divided into
its proximal and distal components. Thick black lines represent teosinte chromosome segments
and narrow lines represent maize chromosome segments. Introgression lines with blue
background shading were developed and described previously. Introgression lines with red
background shading were developed during the course of this study.
[Figure S1 graphic: map of the tb1 genomic region showing PG3, the control region (CR), and tb1, with a horizontal axis in kb from -160 to 0.]
Figure S2: Phenotypic additive effects for the smallest introgression segments.
Three introgression segments are compared. These introgression segments include the full
control region (I44), the proximal component only (I46), and the distal component only (I38).
The length of each introgressed segment is included for both the corresponding maize and
teosinte alleles. The sizes of the introgression segments vary between maize and teosinte because
of insertion/deletion polymorphisms. Additive effects for each of the introgression segments are
shown with standard errors. These effects highlight the independent phenotypic contribution of
both the proximal and distal components of the control region (CR).
[Figure S2 graphic: additive effects for tillering, internode length, and cupules per rank, shown beside a map of PG3, the CR, and tb1. Introgression lengths (bp), in the order they appear in the figure: ~27,931 and ~30,620; 10,617 and 8,383; 5,428 and 532.]
Table S1: Primer sequences for genotyping.
Primer Name^a    Primer Sequence (5' to 3')
GS1-F ACACCGCCACCGACATCT
GS1-R TTGTCCCTGAACGGCCAATA
GS2-F TGGCCAATAAATGTACTAGGTCAC
GS2-R TGATCATACCACCTCTCTATGCAG
GS3-F CATGAACATGCCGTGTGCT
GS3-R TTCTAGTACCTAGTGCGCCCGTAG
GS4-F AGTAGGCCATAGTACGTAC
GS4-R CTCTTTACCGAGCCCCTACA
GS5-F AGTGGACAACCGAACGAAGA
GS5-R GAAGCAACTATCAACACAAGCCTT
GS6-F TGTTGTTGGTGATGGAGTCG
GS6-R CGTGTGTGTGATCGAATGGT
GS7-F AGCCAGGATCAATGGCATAC
GS7-R AGCAAAGGGCATGTGTTACC
GS8-F GTTAACCATGAGACGGCCAC
GS8-R GTCAGAATCCCCTGCTCG
Primer Name^b    Primer Sequence (5' to 3')
FM-F0372 ACCAGCAAGCAGCAAGAAAT
IM-R0375 TTGAGTGTCGCCTAGACTGC
RM-R0377 CCTACTTTTTCATCTCCCGC
FH-F0378 CTGCGATGATGCAAGGAGTA
IH-R0379 CTCAATGCATGCCGTTATTG
FH-R0381 CGTTGTCGACAGTCTCCTCA
^a Primer sequences of markers used for genotyping and detecting new recombinant chromosomes.
^b Primer sequences used for detection of Tourist and Hopscotch elements at the tb1 locus.
Table S2: Germplasm assayed for transposon insertions.
Germplasm Type    Racename    Source^a    Accession    Tourist^b    Hopscotch^b
Maize Landrace* Assiniboine NCRPIS PI213793 M M
Maize Landrace* Bolita INIFAP OAX68 M M
Maize Landrace* Cateto Sulino CIMMYT URG II M M
Maize Landrace* Chalqueno INIFAP MEX48 M M
Maize Landrace* Chapalote INIFAP SIN2 M M
Maize Landrace* Conico INIFAP PUE32 M M
Maize Landrace* Costeno ICA VEN453 M M
Maize Landrace* Cristalino Norteno NCGRP CHI349 M M
Maize Landrace* Dzit Bacal CIMMYT GUA131 M M
Maize Landrace* Gordo CIMMYT CHH160 M M
Maize Landrace* Guirua NCGRP MAG450 M M
Maize Landrace* Nal-tel INIFAP YUC7 M M
Maize Landrace* Pisccorunto PCIM APC13 M M
Maize Landrace* Sabanero NRC SAN329 M M
Maize Landrace* Serrano INIFAP GUA14 M T
Maize Landrace* Zapalote Chico CIMMYT OAX70 M M
Inbred Teosinte Balsas JFD TIL01 T T
Inbred Teosinte Balsas JFD TIL02 T T
Inbred Teosinte Jalisco JFD TIL03 M M
Inbred Teosinte Balsas JFD TIL04 T T
Inbred Teosinte Balsas JFD TIL05 T T
Inbred Teosinte Balsas JFD TIL06 T T
Inbred Teosinte Balsas JFD TIL07 T T
Inbred Teosinte Balsas JFD TIL08 T T
Inbred Teosinte Balsas JFD TIL09 M M
Inbred Teosinte Balsas JFD TIL10 T T
Inbred Teosinte Jalisco JFD TIL11 T T
Inbred Teosinte Balsas JFD TIL12 T T
Inbred Teosinte Jalisco JFD TIL14 T T
Inbred Teosinte Balsas JFD TIL16 T T
Inbred Teosinte Balsas JFD TIL17 M M
Inbred Teosinte Chalco JFD TIL18 T T
Inbred Teosinte Central Plateau JFD TIL25 T T
Non-Inbred Teosinte Balsas CIMMYT 8779 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-113 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y SMH-355 M M
Non-Inbred Teosinte Balsas NCRPIS PI566688 T H
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-172 T T
Non-Inbred Teosinte Balsas INIFAP JSG-377 T T
Non-Inbred Teosinte Balsas INIFAP C-9-78 T T
Non-Inbred Teosinte Balsas CIMMYT 11401 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-178 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y MAS-400 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y SMH-352 T T
Non-Inbred Teosinte Balsas HHI IC #3 T T
Non-Inbred Teosinte Balsas CIMMYT 11402 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-130 T T
Non-Inbred Teosinte Balsas INIFAP JSG-379 T T
Non-Inbred Teosinte Balsas INIFAP C-17-78 T T
Non-Inbred Teosinte Balsas INIFAP C-14-78 T T
Non-Inbred Teosinte Balsas CIMMYT 8760 T T
Non-Inbred Teosinte Balsas INIFAP JSG-378 T T
Non-Inbred Teosinte Balsas CIMMYT 8763 T T
Non-Inbred Teosinte Balsas GWB BK Site 4 T T
Non-Inbred Teosinte Balsas CIMMYT 11406 T T
Non-Inbred Teosinte Balsas CIMMYT 11353 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-121 T T
Non-Inbred Teosinte Balsas CIMMYT 8784 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-109 T T
Non-Inbred Teosinte Jalisco INIFAP JSG-203 T T
Non-Inbred Teosinte Jalisco BFB 967 T T
Non-Inbred Teosinte Balsas INIFAP JSG-387 T T
Non-Inbred Teosinte Balsas INIFAP JSG-193 T T
Non-Inbred Teosinte Balsas CIMMYT 11357 T T
Non-Inbred Teosinte Balsas CIMMYT 8762 T T
Non-Inbred Teosinte Balsas CIMMYT 8783 T T
Non-Inbred Teosinte Balsas INIFAP JSG-192 T T
Non-Inbred Teosinte Balsas INIFAP JSG-374 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-159 T T
Non-Inbred Teosinte Balsas CIMMYT 11404 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y MAS-264 T T
Non-Inbred Teosinte Balsas INIFAP JSG-385 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y MAS-402 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y LOS-43 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-161 T T
Non-Inbred Teosinte Jalisco INIFAP MAS-15 T T
Non-Inbred Teosinte Balsas CIMMYT 11403 T T
Non-Inbred Teosinte Balsas CIMMYT 8776 T T
Non-Inbred Teosinte Balsas CIMMYT 11361 T H
Non-Inbred Teosinte Balsas INIFAP JSG-197 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-120 T T
Non-Inbred Teosinte Balsas INIFAP JSG-191 T T
Non-Inbred Teosinte Balsas NCRPIS PI566686 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y LOS-142 T T
Non-Inbred Teosinte Balsas CIMMYT 11355 T H
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-119 T T
Non-Inbred Teosinte Balsas CIMMYT 8767 M M
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-176 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y LOS-40 T T
Non-Inbred Teosinte Balsas INIFAP JSG-382 T T
Non-Inbred Teosinte Jalisco CIMMYT 9477 T T
Non-Inbred Teosinte Balsas CIMMYT 11388 T T
Non-Inbred Teosinte Balsas INIFAP JSG-187 T T
Non-Inbred Teosinte Balsas CIMMYT 8759 T T
Non-Inbred Teosinte Balsas CIMMYT 8782 T T
Non-Inbred Teosinte Balsas CIMMYT 11376 T T
Non-Inbred Teosinte Balsas INIFAP JSG-391 T T
Non-Inbred Teosinte Balsas CIMMYT 8766 T T
Non-Inbred Teosinte Balsas NCRPIS PI566691 T T
Non-Inbred Teosinte Balsas CIMMYT 8765 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y LOS-74 T T
Non-Inbred Teosinte Balsas NCRPIS PI384064 T T
Non-Inbred Teosinte Jalisco INIFAP JSG Y MAS-401 T T
Non-Inbred Teosinte Balsas INIFAP JSG Y LOS-126 T T
Non-Inbred Teosinte Balsas CIMMYT 8761 T T
Non-Inbred Teosinte Balsas CIMMYT 11407 T T
Non-Inbred Teosinte Balsas CIMMYT 8758 T T
Inbred Teosinte Balsas JFD TIL15 T T
Maize Landrace Arrocillo Amarillo INIFAP VER 311 M M
Maize Landrace Aysuma ICA BOV 331 M M
Maize Landrace Canilla ICA VEN 604 M M
Maize Landrace Capia Blanco CIMMYT ARG 499 M M
Maize Landrace Cariaco NRC COR 334 M M
Maize Landrace Cateto Nortista Precoce CIMMYT SUR I M M
Maize Landrace Cateto Sulino CIMMYT URG II M M
Maize Landrace Chalqueno INIFAP MEX 48 M M
Maize Landrace Chillo NRC ECU 458 M M
Maize Landrace Chococeno ICA ECU 964 M M
Maize Landrace Conejo INIFAP GRO 157 M M
Maize Landrace Confite Puneno PCIM PUN 4 M M
Maize Landrace Conico INIFAP PUE 109 M M
Maize Landrace Conico INIFAP MEX 108 M M
Maize Landrace Coroico ICA BOV 992 M M
Maize Landrace Cuban Flint CIMMYT CUB 63 M M
Maize Landrace Dente Paulista CIMMYT SP III M M
Maize Landrace Dulce de Jalisco INIFAP ZAC 182 M M
Maize Landrace Dzit Bacal CIMMYT GUA 130 M M
Maize Landrace Dzit Bacal INIFAP QOO 20 M M
Maize Landrace Guirua NCGRP MAG 443 M M
Maize Landrace Harinoso Tarapaqueno NRC CHI 421 M M
Maize Landrace Imbricado ICA CUN 372 M M
Maize Landrace Jala CIMMYT JAL 44 M M
Maize Landrace Karapampa NRC BOV 961 M M
Maize Landrace Kculli PCIM CUZ 66 M M
Maize Landrace Montana ICA NAR 426 M M
Maize Landrace morado ICA BOV 567 M M
Maize Landrace motozinteco INIFAP CHS 650 M M
Maize Landrace mushito INIFAP MIC 328 M M
Maize Landrace Nal-tel INIFAP CAM 48 M M
Maize Landrace Nal-tel INIFAP YUC 148 M H
Maize Landrace Nal-tel de Altura INIFAP CHS 196 M M
Maize Landrace Onaveno INIFAP SON 105 H M
Maize Landrace Pepitilla INIFAP MOR 99 M M
Maize Landrace Pira ICA VEN 485 M M
Maize Landrace Pisccorunto PCIM APC 13 M M
Maize Landrace Sabanero NRC SAN 329 M M
Maize Landrace Serrano INIFAP GUA 14 M H
Maize Landrace Serrano Mixe INIFAP OAX 565 T M
Maize Landrace Tablilla de Ocho CIMMYT NAY 185 M M
Maize Landrace Tuson INIFAP TRN 10 M M
Maize Landrace Tuxpeno Norteno INIFAP COA 21 M M
Maize Landrace Tuxpeno Norteno INIFAP CHH 121 M M
Inbred Landrace Araguito NCRPIS MR01 M M
Inbred Landrace Assiniboine NCRPIS MR02 T M
Inbred Landrace Bolita NCRPIS MR03 M M
Inbred Landrace Cateto NCRPIS MR05 M M
Inbred Landrace Chapalote NCRPIS MR06 M M
Inbred Landrace Comiteco NCRPIS MR07 M M
Inbred Landrace Costeno NCRPIS MR08 M M
Inbred Landrace Cravo Riogranense NCRPIS MR09 M M
Inbred Landrace Cristalino Norteno NCRPIS MR10 M M
Inbred Landrace Cuban Flint NCRPIS MR11 M M
Inbred Landrace Havasupai NCRPIS MR12 M M
Inbred Landrace Hickory King NCRPIS MR13 M M
Inbred Landrace Longfellow Flint NCRPIS MR14 M M
Inbred Landrace Pisankalla NCRPIS MR17 M M
Inbred Landrace Reventador NCRPIS MR18 M M
Inbred Landrace Santa Domingo NCRPIS MR19 T M
Inbred Landrace Shoe Peg NCRPIS MR20 M M
Inbred Landrace Tabloncillo NCRPIS MR21 M M
Inbred Landrace Tuxpeno NCRPIS MR22 M M
Inbred Landrace Zapalote Chico NCRPIS MR23 M M
Inbred Landrace Chullpi NCRPIS MR24 M M
Inbred Landrace Pororo NCRPIS MR25 M M
Inbred Landrace Pollo NCRPIS MR26 M M
Maize Inbred NCRPIS B73 M M
Maize Inbred NCRPIS Mo17 M M
Maize Inbred NCRPIS B97 M M
Maize Inbred NCRPIS CML52 M M
Maize Inbred NCRPIS CML69 M M
Maize Inbred NCRPIS CML103 M M
Maize Inbred NCRPIS CML228 M M
Maize Inbred NCRPIS CML247 M M
Maize Inbred NCRPIS CML277 M M
Maize Inbred NCRPIS CML322 M M
Maize Inbred NCRPIS CML333 M M
Maize Inbred NCRPIS Hp301 M M
Maize Inbred NCRPIS Il14H M M
Maize Inbred NCRPIS Ki3 M M
Maize Inbred NCRPIS Ki11 M M
Maize Inbred NCRPIS Ky21 M M
Maize Inbred NCRPIS M37W M M
Maize Inbred NCRPIS M162W M M
Maize Inbred NCRPIS MO18W M M
Maize Inbred NCRPIS MS71 M M
Maize Inbred NCRPIS NC350 M M
Maize Inbred NCRPIS NC358 M M
Maize Inbred NCRPIS Oh7B M M
Maize Inbred NCRPIS Oh43 M M
Maize Inbred NCRPIS P39 M M
Maize Inbred NCRPIS Tx303 M M
Maize Inbred NCRPIS Tzi8 M M
Maize Inbred NCRPIS W22 M M
*Maize landrace made haploid for DNA extraction and sequencing.
^a Source information can be found at www.panzea.org.
^b "M" denotes the common maize haplotype (presence of the element), "T" denotes the common teosinte haplotype
(absence of the element).
Table S3: Input values for HKA tests.
Loci^a    S^b    L^c    N^d    K^e
AY104395 10 477 11 4.909
AY106816 39 532 13 37.615
AY107192 13 495 14 6.286
AY107248 21 538 14 7.286
AY111546 4 674 15 3.000
AY111711 26 528 15 13.200
Segment A 40 740 16 15.688
Segment B 89 1086 16 80.938
Segment C 20 1723 15 35.267
Segment D 9 1392 16 47.313
^a Six neutral loci and four segments tested in this paper (Fig. 3a).
^b Number of segregating sites.
^c Number of total sites excluding gaps.
^d Sample size.
^e Average nucleotide difference.
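As an illustration of how the S, L, and N columns of Table S3 relate to a diversity estimate, the sketch below computes Watterson's estimator per site, θ = S / (a_n × L), where a_n is the sum of 1/i for i from 1 to n - 1. This is not the HKA test itself and is not the analysis code used in this study; the function name is hypothetical, and N is assumed here to be the number of sampled sequences.

```python
def watterson_theta_per_site(s, l, n):
    """Watterson's estimator of theta per site.

    s: number of segregating sites
    l: number of sites compared (gaps excluded)
    n: number of sampled sequences"""
    a_n = sum(1.0 / i for i in range(1, n))
    return s / (a_n * l)

# Segment D from Table S3: S = 9, L = 1392, N = 16.
theta_segment_d = watterson_theta_per_site(9, 1392, 16)
```

Applying the same function to each row allows the segregating-site counts to be compared across loci of different length and sample size.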
Chapter 2
Do large effect QTLs fractionate?
A case study at the maize domestication QTL teosinte branched1
Abstract
Quantitative trait loci (QTL) mapping is a valuable tool for studying the genetic architecture of
trait variation. Despite the large number of QTL studies reported in the literature, the identified
QTL are rarely mapped to the underlying genes and it is usually unclear whether a QTL
corresponds to one or multiple linked genes. Similarly, when QTL for several traits co-localize,
it is usually unclear whether this is due to the pleiotropic action of a single gene or multiple
linked genes, each affecting one trait. The domestication gene teosinte branched1 (tb1) was
previously identified as a major domestication QTL with large effects on the differences in plant
and ear architecture between maize and teosinte. Here we present the results of two experiments
that were performed to determine if the single gene tb1 explains all trait variation for its genomic
region or if the domestication QTL at tb1 fractionates into multiple linked QTL. For traits
measuring plant architecture, we detected only one QTL per trait and these QTL all mapped to
tb1. These results indicate that tb1 is the sole gene for plant architecture traits that segregates in
our QTL mapping populations. For most traits related to ear morphology, we detected multiple
QTL per trait in the tb1 genomic region including a large effect QTL at tb1 itself plus one or two
additional linked QTL. tb1 is epistatic to two of these additional QTL for ear traits. Overall,
these results provide examples for both a major QTL that maps to a single gene, as well as a case
in which a QTL fractionates into multiple linked QTL.
Introduction
Quantitative trait loci (QTL) mapping studies have become widely used to elucidate the genetic
architecture of trait variation in many organisms (Mackay et al. 2009). A common observation in
these studies is that QTL of large effect are often detected. Noor et al. (2001) have questioned
whether such large effect QTL represent single genes or groups of tightly linked genes. These
authors have suggested that such large effect QTL, upon closer examination, might fractionate
into multiple linked small effect QTL, representing multiple genes. A single QTL does not
necessarily equal a single gene. Even in cases where QTL effects have been fine mapped to a
specific gene, the research may not formally exclude the presence of additional linked genes that
contribute to the overall QTL effect for that genomic region.
Doebley and Stec (1991, 1993) identified a QTL of large effect on the long arm of maize
chromosome 1 controlling the differences in plant and ear architecture between maize and
teosinte. These authors proposed that tb1, a known mutant of maize, was the gene underlying this
QTL because tb1 falls within the 1 LOD support interval for the QTL, and because the tb1
mutant and the QTL affect the same suite of traits. Subsequently, Doebley et al. (1995) used a
complementation test that showed that the teosinte allele of the QTL fails to complement the tb1
mutant of maize, indicating that they are alleles of the same gene. However, complementation
tests do not provide formal proof because of the potential for non-allelic non-complementation.
Additional support for the hypothesis that tb1 is the gene underlying the major
domestication QTL was obtained after the cloning of tb1 (Doebley et al. 1997). With a
knowledge of the molecular identity of tb1, Doebley et al. (1997) showed that the maize allele of
this gene is expressed at twice the level of the teosinte allele in the developing branch and in
immature ears. Thus, a change in gene regulation was hypothesized to be the causative difference
between maize and teosinte. Finally, Clark et al. (2006) provided formal proof that tb1 is the
QTL by fine-mapping the QTL to a 12 kb “control region” located ~58-69 kb upstream of the
tb1 open reading frame. These authors further demonstrated that this control region contains a
factor that acts as a cis-regulatory element with the maize allele conditioning a higher level of
tb1 expression than the teosinte allele. However, their data do not address the possibility of
additional QTL linked to tb1, and indeed some of their data suggest that such additional linked
QTL may exist, i.e. that tb1 fractionates into multiple linked QTL.
In this paper, we report two experiments performed to address whether there are
additional QTL closely linked to tb1. In Experiment I, we analyzed a mapping population in
which the tb1 control region identified by Clark et al. (2006) is fixed for the teosinte haplotype,
but the regions flanking it are segregating for maize vs. teosinte chromosomal segments. If there
are additional QTL linked to the control region, then there should be phenotypic effects
associated with the segregating maize vs. teosinte chromosomal segments despite the fact that
the tb1 control region is not segregating. Conversely, if the control region alone explains all
phenotypic effects, then there should be no phenotypic effects associated with the flanking
chromosomal regions. In Experiment II, we analyzed a set of nearly isogenic recombinant inbred
lines (NIRILs) for the tb1 genomic region to see if we could detect any QTL other than tb1. This
experiment has more power than a standard QTL analysis to detect closely linked QTL because
the NIRILs have an isogenic background and the NIRILs were grown in replicate to obtain better
estimates of QTL effects.
Based on these two experiments, we confirm that tb1 is a large effect QTL contributing to
the differences in plant and ear architecture between maize and teosinte. In fact, tb1 is the only
QTL for plant architecture traits that we detected. However, we identify four additional QTL
affecting ear architecture. One of these additional QTL is located only 6 cM upstream of tb1.
Two of these additional QTL have significant epistatic interactions with tb1. Thus, our results
provide examples for both a major QTL that maps to a single gene as shown for plant
architecture, as well as a case in which a QTL fractionates into multiple QTL as shown for ear
architecture.
Materials and Methods
Plant materials
Segments of the long arm of chromosome 1 from teosinte were introgressed into a maize inbred
W22 background for both Experiments I and II. For Experiment I, a segment of the long arm of
chromosome 1 from a teosinte (Zea mays ssp. mexicana; collection Wilkes-Panindicuaro) was
introgressed into W22 via six generations of backcrossing (Figure 1). A BC6S1 line (I01) that
was homozygous for the teosinte alleles at markers bnlg615 and bnlg1671, which flank tb1, was
recovered. I01 was then crossed to W22 and the F2 progeny of this cross were screened for
crossovers near tb1. A plant with one of the newly identified recombinants was itself crossed to
W22, and the F2 progeny of this cross were screened for crossovers near tb1. From this process, a
homozygous introgression line (I16) containing an ~69 kb segment of teosinte chromosome
which encompasses the tb1 upstream control region and part of the ORF was recovered (Clark et
al. 2006). Homozygous I01 and I16 lines were crossed and the resulting F1 plants were selfed to
produce an F2 population for Experiment I.
Figure 1: Map of the introgression lines used. All introgressed segments are drawn to scale.
Black shaded areas indicate teosinte chromosome segments, unshaded areas represent maize
chromosome segments. Markers flanking the introgressions and the position of tb1 are shown for
reference. The introgressed segment in I16 is only ~69 kb.
For Experiment II, a segment of the long arm of chromosome 1 (T1L) from a teosinte
(Zea mays ssp. parviglumis; Iltis and Cochrane collection 81) was introgressed into W22 via six
generations of backcrossing (Figure 1). During the backcrossing process, molecular markers
were used both to follow the target segment surrounding the QTL on the long arm of
chromosome 1, as well as to eliminate teosinte chromosome segments at other major
domestication QTL identified by Doebley and Stec (1993) (Table S1). Six separate BC6 plants
heterozygous for the target segment were selfed to give six BC6S1 families (designated families
A-F). These six families were selfed an additional five generations to produce a set of 153
homozygous nearly isogenic recombinant inbred lines (NIRILs). These 153 lines were
distributed among the six families as follows: A: 24, B: 31, C: 39, D: 25, E: 19, F: 15. These
lines possess a set of maize-teosinte recombinant chromosomes for the tb1 genomic region in the
W22 genetic background. These 153 lines make up the QTL mapping population of Experiment
II.
Molecular markers and linkage map
Plants in Experiment I were genotyped using a PCR-based indel marker, GS3, previously
described by Clark et al. (2006). GS3 is located in the coding region of tb1 and segregates in the
I01 × I16 F2 population. Plants in Experiment II were genotyped using a set of 25 PCR-based
markers: 16 SSRs, 6 insertion/deletion (indel) markers, and 3 markers scored for the presence/absence
of a PCR product. Marker information is available at either Panzea (www.panzea.org) or
MaizeGDB (www.maizegdb.org). There were a total of 174 crossovers among the 153 lines,
averaging 1.1 crossovers per line. The distribution of crossovers among lines was as follows: 0
(46 lines), 1 (52 lines), 2 (44 lines), 3 (10 lines) and 4 (1 line). A genetic map was constructed
using the Kosambi map function and a genotyping error rate of 0.0001 as parameter values for
the “est.map” command in the R/qtl module of the R statistical computing package (Broman et
al. 2003).
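The crossover tally and the Kosambi map function mentioned above can be checked with a short sketch. It is illustrative only: the map itself was built with the R/qtl "est.map" command, and the helper names `kosambi_cm` and `kosambi_r` are hypothetical. The crossover counts are the ones reported in the text.

```python
import math

# Crossovers per line as reported in the text: {number of crossovers: number of lines}
crossover_distribution = {0: 46, 1: 52, 2: 44, 3: 10, 4: 1}

n_lines = sum(crossover_distribution.values())
n_crossovers = sum(k * n for k, n in crossover_distribution.items())
mean_crossovers = n_crossovers / n_lines


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


print(n_lines, n_crossovers, round(mean_crossovers, 1))
print(round(kosambi_cm(0.10), 2))
```

The first printed line reproduces the totals given in the text (153 lines, 174 crossovers, about 1.1 crossovers per line).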
Phenotypic data collection
The plants for Experiment I were grown at the University of Wisconsin West Madison
Agricultural Research Station, Madison, WI, USA during summer 2006. F2 seed from three ears
(A, B, and C) generated by three separate I01 × I16 crosses was planted in a randomized
complete block design using a grid with 0.9 meter spacing between plants in both dimensions.
This spacing minimized the degree to which plants shaded their neighbors. The following five
traits were phenotyped for Experiment I: cupules per rank (CUPR; number of cupules in a single
rank from base to the tip of the ear), ear diameter (ED; diameter, in mm, of the midsection of
each ear), lateral branch internode length (LBIL; mean internode length, in cm, of the uppermost
lateral branch), tillering (TILL; the ratio of the sum of tiller heights to plant height), and tiller
number (TILN; the number of tillers per plant). CUPR and ED were both measured on the
uppermost, well-formed lateral inflorescence (ear) of each plant.
The NIRILs for Experiment II, along with the backcross parent W22, were grown using a
randomized complete block design at the University of Wisconsin West Madison Agricultural
Research Station, Madison, WI, USA during summer 2008. The design included three replicates
(blocks A, B, and C) with a single 10-plant plot of each NIRIL per replicate. Each plot was 3.7 m
long and 0.9 m wide. The plots within each block were arranged in a grid with row and column
designations so that position effects could be included during data analysis. Three plants were
phenotyped per plot. In addition to the five traits measured in Experiment I, the following three
traits were evaluated: 10-kernel length (10KL; length, in mm, of 10 consecutive kernels in a
single rank along the ear), ear length (EL; distance, in cm, from the base to the tip of the ear),
and percent staminate spikelets (STAM; percentage of male spikelets in the inflorescence).
10KL, CUPR, ED, EL, and STAM were all measured on the uppermost, well-formed lateral
inflorescence (ear) of each plant.
Data analysis
For Experiment I, we used the GLM procedure of SAS (Littel et al. 1996) to compare the effects
of the I01 and I16 introgression segments on phenotypes. Genotype (homozygous I01,
homozygous I16, or heterozygous) and ear parent (A, B or C) were considered as fixed effects.
The general linear model used was
$$Y_{ijk} = \mu + a_i + b_j + e_{ijk}$$
where $Y_{ijk}$ is the trait value for the $k$th plant from the $j$th ear parent with the $i$th
genotype, $\mu$ is the overall mean of the experiment, $a_i$ is the genotype effect, $b_j$ is the
ear parent effect, and $e_{ijk}$ is the sampling error. Using this model, the effects of the
different introgressions (I01 vs. I16) were
evaluated.
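A minimal sketch of how the terms of the Experiment I model can be estimated is given below. It is not the SAS GLM analysis itself. It assumes a balanced design (equal numbers of plants in every genotype by ear parent cell), in which case the sum-to-zero least-squares estimates are simply deviations of the marginal means from the grand mean. A real F2 family is unbalanced, which is one reason a general linear model is fitted instead. The function name `main_effects`, the genotype labels, and all trait values are hypothetical.

```python
from statistics import mean


def main_effects(records):
    """Estimate mu, a_i and b_j for Y = mu + a_i + b_j + e in a balanced design.

    records: list of (genotype, ear_parent, trait_value) tuples.
    """
    mu = mean(y for _, _, y in records)
    genotypes = sorted({g for g, _, _ in records})
    parents = sorted({p for _, p, _ in records})
    a = {g: mean(y for gi, _, y in records if gi == g) - mu for g in genotypes}
    b = {p: mean(y for _, pj, y in records if pj == p) - mu for p in parents}
    return mu, a, b


# Hypothetical, noise-free trait values built from known effects.
true_a = {"I01/I01": -1.0, "I01/I16": 0.0, "I16/I16": 1.0}
true_b = {"A": 0.5, "B": 0.0, "C": -0.5}
records = [
    (g, p, 30.0 + true_a[g] + true_b[p])
    for g in true_a
    for p in true_b
    for _ in range(4)  # four plants per cell
]

mu, a, b = main_effects(records)
print(mu, a, b)
```

With noise-free input the function returns the effects used to build the data, which is a quick way to confirm that the model terms are being read correctly.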
For Experiment II, we obtained least-squares means for each NIRIL using the MIXED
procedure of SAS (Littel et al. 1996). The NIRIL (or parental) lines and families (A-F) were
considered fixed effects while blocks (A, B, and C) and plot coordinates were treated as random
effects. The linear model used was
$$Y_{hijklm} = \mu + a_h(b_i) + b_i + c_j + d_k + f_l + e_{hijkl} + g_{hijklm}$$
where $Y_{hijklm}$ is the trait value for the $m$th plant at the $l$th column and $k$th row in the
$j$th block of the $h$th NIRIL nested in the $i$th family, $\mu$ is the overall mean of the
experiment, $a_h(b_i)$ is the effect of the $h$th NIRIL (or parental) line nested within the
$i$th family, $b_i$ is the family effect, $c_j$ is the block effect, $d_k$ is the row effect,
$f_l$ is the column effect, $e_{hijkl}$ is the experimental error (random variation among plots),
and $g_{hijklm}$ is the (within-plot) sampling error. All fixed effects were significant and were included in the
model for the calculation of the least-squares means. The random effects of this full model were
subjected to the Likelihood Ratio Test for significance for each trait. Effects that were not
significant were dropped from the model on a trait by trait basis.
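The likelihood ratio test used to decide whether a random effect stays in the model can be sketched as follows. This is an illustration rather than the SAS procedure: the log-likelihood values and the 0.05 cutoff are hypothetical, and the statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail equals erfc(sqrt(x/2)). Because a variance component is tested on the boundary of its parameter space, some analysts halve this p-value; that refinement is omitted here.

```python
import math


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """LRT for dropping one random effect from the model (1 df chi-square reference)."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-square with 1 df
    return stat, p_value


# Hypothetical log-likelihoods for a model with and without the row effect.
stat, p_value = likelihood_ratio_test(loglik_full=-512.3, loglik_reduced=-515.9)
keep_row_effect = p_value < 0.05  # assumed significance level
print(round(stat, 2), round(p_value, 4), keep_row_effect)
```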
The least-squares means estimates were used for QTL mapping in Experiment II, which
was conducted in the R/qtl module of the R statistical computing package (Broman et al. 2003).
For each trait, an initial QTL scan was performed using simple interval mapping with a 0.25 cM
step (Lander and Botstein 1989) and the position of the highest LOD score was recorded.
Statistical significance of the peak LOD score was assessed using 10,000 permutations of the
data (Doerge and Churchill 1996). Then, the position and effect of the QTL were refined using the
Haley-Knott Regression method (Haley and Knott 1992) by executing the “calc.genoprob”
command (0.25 cM step size and assumed genotyping error rate of 0.001), followed by the
“fitqtl” command. To search for additional QTL, the “addqtl” command was used. If a second
QTL was detected, then “fitqtl” was used to test a model containing both QTL and their
interaction effect. If both QTL remained significant, the “refineqtl” command was used to re-
estimate the QTL positions based on the full model including both QTL. Finally, each QTL was
removed from the model and then added back using the “addqtl” command to re-confirm its
significance and position. Approximate confidence intervals for the locations of the QTL were
obtained via 1.5 LOD support intervals to each side of the position of the LOD maximum.
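The three ideas in this procedure (a LOD score at each position, a permutation-based significance threshold, and a 1.5 LOD support interval) can be sketched in a few lines. The sketch deliberately swaps in plain marker regression on two genotype classes for the interval mapping and Haley-Knott regression performed in R/qtl, and it runs on a simulated population. All function names, the switch probability between markers, the QTL effect, and the number of permutations are assumptions made for illustration.

```python
import math
import random


def lod_two_class(genotypes, phenotypes):
    """LOD for a marker with two genotype classes: (n/2) * log10(RSS0 / RSS1)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss1 = 0.0
    for allele in set(genotypes):
        ys = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        class_mean = sum(ys) / len(ys)
        rss1 += sum((y - class_mean) ** 2 for y in ys)
    return (n / 2.0) * math.log10(rss0 / rss1)


def scan(markers, phenotypes):
    """LOD at every marker; markers is a list of per-marker genotype lists."""
    return [lod_two_class(g, phenotypes) for g in markers]


def permutation_threshold(markers, phenotypes, n_perm, alpha, rng):
    """(1 - alpha) quantile of the genome-wide maximum LOD over shuffled phenotypes."""
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(scan(markers, shuffled)))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return maxima[index]


def lod_support_interval(positions, lods, drop=1.5):
    """Positions bounding the contiguous region within `drop` LOD of the peak."""
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    left = right = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]


# Hypothetical population: 153 inbred lines, 6 markers, a QTL at the fourth marker.
rng = random.Random(1)
positions = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
lines = []
for _ in range(153):
    allele = rng.choice("MT")
    genotype = [allele]
    for _ in positions[1:]:
        if rng.random() < 0.1:  # assumed chance of an allele switch between markers
            allele = "T" if allele == "M" else "M"
        genotype.append(allele)
    lines.append(genotype)
markers = [[line[i] for line in lines] for i in range(len(positions))]
phenotypes = [(2.0 if line[3] == "T" else 0.0) + rng.gauss(0.0, 1.0) for line in lines]

lods = scan(markers, phenotypes)
threshold = permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.01, rng=rng)
print(lods, threshold, lod_support_interval(positions, lods))
```

The study used 10,000 permutations and a 0.25 cM scan step; the smaller numbers here only keep the example fast.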
We calculated broad-sense heritabilities ($H^2$) for Experiment II on a line-mean basis as
$$H^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_{ge}^2 + \sigma_e^2 / r}$$
where $\sigma_g^2$ is the genotypic variance, $\sigma_{ge}^2$ is the genotype × environment
interaction variance, and $\sigma_e^2$ is the experimental error variance, which is divided by the
number of replicates ($r$ = 9). We used the
MIXED procedure of SAS to fit a linear random-effect model for the estimation of the variance
components (Littel et al. 1996). All data for both Experiments I and II are available at
www.panzea.org.
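As a worked illustration of the line-mean heritability described above, the sketch below assumes that only the experimental error variance is divided by the number of replicates, which is how the definitions in the text read. The function name and the variance components are hypothetical; only r = 9 comes from the text.

```python
def line_mean_heritability(var_g, var_ge, var_e, r):
    """Broad-sense heritability on a line-mean basis.

    Assumes H2 = var_g / (var_g + var_ge + var_e / r).
    """
    return var_g / (var_g + var_ge + var_e / r)


# Hypothetical variance components; r = 9 as stated in the text.
h2 = line_mean_heritability(var_g=4.0, var_ge=0.5, var_e=4.5, r=9)
print(h2)
```

Averaging over replicates shrinks the error term, which is why heritability on a line-mean basis is higher than it would be for single plants with the same variance components.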
Results
Experiment I: To test whether the tb1 control region identified by Clark et al. (2006) is
sufficient to explain all of the phenotypic effects observed when a teosinte segment of the long
arm of chromosome 1 is introgressed into W22, we analyzed an F2 family from an I01 (full
introgression segment) × I16 (tb1 control region only) cross (Figure 1). A general linear model
was used to compare the effects among the genotypic classes in this family (Table 1). For plant
architecture (branching) traits (LBIL, TILL, and TILN), we could not reject the null hypothesis
that I01 and I16 have equal effects on phenotype, suggesting that there are no additional QTL for
these traits beyond the control region identified by Clark et al. (2006). However, for ear
morphology traits (CUPR and ED), I01 and I16 have significantly different phenotypic effects.
Thus, there must be QTL in addition to the tb1 control region for ear traits.
Table 1: Experiment I results: the comparison of introgressed segment I01 to I16
Trait   Additive Effects   P-Value   Units
CUPR    -1.1044            <0.0001   Count
ED      -1.7403            <0.0001   mm
LBIL    -0.0620            0.4600    cm
TILL    0.0042             0.9344    Ratio
TILN    -0.0612            0.2402    Count
Experiment II: Quantitative trait variation. Given that Experiment I indicated that there are one
or more ear trait QTL linked to the tb1 control region, we attempted to map these QTL using a
set of 153 NIRILs in Experiment II. These lines were grown in a randomized block design with
three blocks and one plot of each line per block. The least-squares means for each trait for each
NIRIL were estimated using a mixed linear model. The heritabilities of the traits are generally
high with all values being greater than 0.7 (Figure 2).
Histograms of the trait distributions show a large degree of separation between the
phenotypic means for the two parental lines for all traits except EL (Figure 2). For example, for
TILN, the maize parent has a mean value of approximately 0.5 tillers, while the teosinte parent
has a value of 2 tillers. For all traits, the mean value for the maize parental line was located
toward the edge of the trait distribution representing more maize-like phenotypes, while the
mean value for the teosinte parental line was associated with more teosinte-like phenotypes. The
trait distributions tend to be somewhat bimodal and/or skewed (Figure 2). 10KL has a distinctly
bimodal distribution while other traits are more weakly bimodal (CUPR, LBIL, and TILL). For
traits with a bimodal distribution, the means for the maize and teosinte parental lines are each
located at one of the two modes of the distribution. In all cases the trait distributions are skewed
toward teosinte-like phenotypic values. This skew toward teosinte-like phenotypes occurs due to
the excess of NIRILs with the maize genotype in the NIRIL population. For example, 44 NIRILs
were recovered that genotyped maize at all 25 markers, whereas only two NIRILs were
recovered that genotyped teosinte throughout the region.
Figure 2: Frequency distribution of nearly isogenic recombinant inbred lines (NIRILs)
least-squares means of the eight traits measured in this study. The arrows and black bars indicate the
bin containing the parental lines: maize inbred W22 (W) and introgression line W22-T1L (T).
Heritabilities were calculated on a plot basis for each trait. Traits are abbreviated as follows:
10-kernel length (10KL, in mm), cupules per rank (CUPR), ear diameter (ED, in mm), ear length
(EL, in cm), lateral branch internode length (LBIL, in cm), staminate spikelets (STAM, percent),
tillering (TILL), and tiller number (TILN).
[Figure 2 panels (trait, x-axis, heritability): TILL, tillering (ratio), H2 = 0.76; TILN, tiller
number (count), H2 = 0.74; LBIL, lateral branch internode length (cm), H2 = 0.91; STAM, percent
staminate spikelets, H2 = 0.84; EL, ear length (cm), H2 = 0.73; ED, ear diameter (mm), H2 = 0.94;
10KL, length of ten kernels (mm), H2 = 0.95; CUPR, cupules per rank (count), H2 = 0.83.]
Ear length (EL) is the trait for which the maize and teosinte parental lines are the
least differentiated (Figure 2). This is because ear length is a composite of two other traits – the
number of kernels or cupules along the length of the ear (CUPR) and the length of each cupule or
kernel (10KL). For CUPR, the maize parent line has a larger number of cupules (kernels) than
the teosinte parent line, contributing to a longer ear relative to teosinte. However for 10KL, the
maize parent line has less elongated cupules, giving it a shorter ear relative to teosinte. Thus,
overall the maize and teosinte parent lines have ears that are roughly equivalent in length but
with different underlying morphological bases.
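The compensation between the two components of ear length can be made concrete with a small calculation. It assumes, as a simplification, that ear length is roughly the number of cupules in a rank multiplied by the mean kernel length. The function name and the parental values are hypothetical and were chosen only to fall within the ranges plotted in Figure 2.

```python
def approx_ear_length_cm(cupules_per_rank, ten_kernel_length_mm):
    """Approximate ear length as CUPR times mean kernel length (10KL / 10)."""
    kernel_length_mm = ten_kernel_length_mm / 10.0
    return cupules_per_rank * kernel_length_mm / 10.0  # convert mm to cm


# Hypothetical parental values: more but shorter cupules vs. fewer but longer ones.
maize_like = approx_ear_length_cm(cupules_per_rank=32, ten_kernel_length_mm=44.0)
teosinte_like = approx_ear_length_cm(cupules_per_rank=27, ten_kernel_length_mm=52.0)
print(round(maize_like, 2), round(teosinte_like, 2))
```

Under these assumed values the two ears differ by less than a millimetre in length even though both components differ substantially.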
QTL mapping. We identified 12 QTL for the eight traits in the 63.1 cM region on
chromosome 1 (Figure 3, Table 2). The LOD thresholds (P=0.01) for QTL detection were
between 2.42 and 2.58, depending on the trait. All 12 QTL have associated LOD scores of 5.5 or
greater, thus they have strong statistical support. For five of the eight traits (EL, LBIL, STAM,
TILL, and TILN), a single QTL was detected, while for three traits (10KL, CUPR, and ED) two
or more QTL were detected. Significant interaction effects were also detected between two of the
QTL for 10KL (10kl1.1 and 10kl1.2) and between two of the QTL for ED (ed1.2 and ed1.3). The R2
values for the genetic models for the traits range from 0.15 to
0.88. In most cases, the model R2 values correspond closely to the H2 values. For example, R2 vs.
H2 are 0.88 vs. 0.95 for 10KL, 0.69 vs. 0.83 for CUPR, 0.73 vs. 0.94 for ED, and 0.70 vs. 0.76
for TILL. This correspondence indicates that the detected QTL and interactions explain all or
most of the heritable variation among the NIRILs.
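One way to read this comparison is as the ratio of model R2 to H2 for each trait, that is, the share of heritable variation captured by the fitted QTL model. The sketch below uses the R2 (converted from percent to proportions) and H2 values listed in Table 2. Treating the ratio as a summary statistic is an interpretation added here, not a calculation reported in the study.

```python
# (model R2 as a proportion, H2) for each trait, transcribed from Table 2.
model_fit = {
    "10KL": (0.879, 0.95),
    "CUPR": (0.685, 0.83),
    "ED": (0.727, 0.94),
    "EL": (0.153, 0.73),
    "LBIL": (0.667, 0.91),
    "STAM": (0.315, 0.84),
    "TILL": (0.702, 0.76),
    "TILN": (0.672, 0.74),
}

# Share of the heritable variation explained by the QTL model for each trait.
explained = {trait: r2 / h2 for trait, (r2, h2) in model_fit.items()}

for trait, share in sorted(explained.items(), key=lambda item: -item[1]):
    print(f"{trait}: {share:.2f}")
```

The ratio is highest for 10KL, TILL, and TILN and lowest for EL and STAM.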
Single QTL were identified for five of the eight traits analyzed: EL, LBIL, STAM, TILL,
and TILN. Four of the five single QTL (lbil1.1, stam1.1, till1.1, tiln1.1) have 1.5 LOD intervals
that include tb1 (Figure 3). The QTL for EL (el1.1) is located 21 cM upstream of tb1 and its 1.5
LOD support interval is well separated from tb1. The single QTL at tb1 for LBIL, TILL, and
TILN all have relatively large effects with large R2 and H2 values (Table 2, Figure 3). The
position and effect of lbil1.1, till1.1, and tiln1.1 suggest that tb1 explains all or most of the
genetic variation for these traits. These three large effect single QTL over tb1 all pertain to plant
architecture traits. The remaining five traits are not governed by a single QTL at tb1 and these
five traits are all ear traits. These results indicate that a large effect QTL at tb1 accounts for all of
the variation for plant traits among these lines, although ear traits have a more complex genetic
architecture. These results are consistent with the results of Experiment I.
Figure 3: Map of the 12 QTL detected in this study on chromosome arm 1L. Horizontal bars for
each QTL represent the 1.5 LOD support interval and the narrow vertical line marks the position
of the peak LOD score: black bars indicate additive QTL and grey bars indicate QTL with
interactions. The red line marks the position of tb1. QTL names are based upon the trait name
abbreviations followed by the chromosome number; the numbers after the period enumerate the
QTL detected for each trait. Traits are abbreviated as follows: 10-kernel length (10KL, in mm),
cupules per rank (CUPR), ear diameter (ED, in mm), ear length (EL, in cm), lateral branch
internode length (LBIL, in cm), staminate spikelets (STAM, percent), tillering (TILL), and tiller
number (TILN). The genetic map below the QTL plot indicates the extent of the introgressed
W22-T1L segment. The position of each marker locus is shown in cM.
Table 2: Experiment II results: QTL summary data
QTL         Position (cM)  LOD Score  CI (cM)        Additive Effect  Units    LOD Cutoff  R2 (%)  H2
10kl1.1     40.25          24.9       39.00 - 41.25  3.5              mm       -           13.5    -
10kl1.2     46.25          26.1       45.75 - 46.75  6.0              mm       -           14.5    -
10kl1.1:2   -              9.8        -              6.6              mm       -           4.2     -
10KL_Model  -              70.1       -              16.1             mm       2.58        87.9    0.95
cupr1.1     29.00          6.5        22.50 - 33.25  -1.7             Count    -           6.8     -
cupr1.2     46.50          26.3       45.25 - 49.25  -3.5             Count    -           38.1    -
CUPR_Model  -              38.4       -              -5.2             Count    2.51        68.5    0.83
ed1.1       8.50           9.3        0.00 - 11.75   -1.7             mm       -           8.8     -
ed1.2       31.00          10.1       28.00 - 34.25  -1.4             mm       -           9.7     -
ed1.3       45.00          22.5       44.25 - 46.75  -2.7             mm       -           26.5    -
ED1.2:3     -              5.2        -              -2.4             mm       -           4.6     -
ED_Model    -              43.1       -              -8.2             mm       2.45        72.7    0.94
el1.1       23.75          5.5        21.75 - 28.00  -0.8             cm       2.45        15.3    0.73
lbil1.1     45.00          36.5       44.25 - 45.50  2.4              cm       2.42        66.7    0.91
stam1.1     39.50          13.6       37.25 - 45.50  10               Percent  2.45        31.5    0.84
till1.1     45.25          40.2       44.50 - 46.25  1.3              Ratio    2.56        70.2    0.76
tiln1.1     45.50          37.1       44.50 - 46.50  1.2              Count    2.42        67.2    0.74
Multiple QTL were identified for 10KL, CUPR, and ED (Table 2, Figure 3). In all cases,
the multiple QTL for a single trait act in the same direction with the maize alleles contributing to
a maize-like phenotype and the teosinte alleles to a teosinte-like phenotype. For two traits (10KL
and ED), significant interaction effects were identified between QTL. For all traits with multiple
QTL, the QTL with the largest LOD score for each trait had a 1.5 LOD interval that includes, or
is less than 1 cM away from, tb1. For example, ed1.3, which falls directly over tb1, has a large
LOD score (22.5), while the other two QTL for ED have much smaller LOD scores (10.1 and
9.3). Thus, these data suggest that tb1 is the major QTL for 10KL, CUPR, and ED, even if there
are other QTL within the introgressed segment.
Refining QTL positions. For 10KL and CUPR, the largest effect QTL falls near tb1 but
tb1 lies outside the 1.5 LOD support interval. Since there are two QTL for each of these traits,
we reassessed whether the presence of multiple QTL was biasing the estimates of the QTL
positions. We subdivided the dataset to fix one of the QTL for a single genotype (maize or
teosinte) and then scanned the segregating region that remained for QTL. By scanning for QTL
with these subsets of the data, we can reevaluate whether there are two QTL in the positions
indicated by our initial analysis.
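The subsetting strategy can be illustrated with a toy data set in which one QTL only has an effect when the teosinte allele is present at a second QTL. The sketch uses a simple two-class marker regression LOD rather than the R/qtl scan used in the study. The locus names `qtl1` and `qtl2`, the effect sizes, and the deterministic noise pattern are all hypothetical.

```python
import math


def lod_two_class(genotypes, phenotypes):
    """LOD for a marker with two genotype classes: (n/2) * log10(RSS0 / RSS1)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss1 = 0.0
    for allele in set(genotypes):
        ys = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        class_mean = sum(ys) / len(ys)
        rss1 += sum((y - class_mean) ** 2 for y in ys)
    return (n / 2.0) * math.log10(rss0 / rss1)


def lod_in_subset(lines, fixed_locus, fixed_allele, test_locus):
    """Keep lines carrying fixed_allele at fixed_locus, then test test_locus."""
    kept = [line for line in lines if line[fixed_locus] == fixed_allele]
    return lod_two_class([line[test_locus] for line in kept],
                         [line["y"] for line in kept])


# Hypothetical lines: qtl2 always has an effect; qtl1 acts only when qtl2 is teosinte.
lines = []
for a in "MT":  # allele at qtl1 (the minor QTL)
    for b in "MT":  # allele at qtl2 (the major QTL)
        base = 44.0 + (6.0 if b == "T" else 0.0) + (3.0 if a == "T" and b == "T" else 0.0)
        for noise in (-0.5, 0.5, -0.5, 0.5):
            lines.append({"qtl1": a, "qtl2": b, "y": base + noise})

lod_with_teosinte = lod_in_subset(lines, "qtl2", "T", "qtl1")
lod_with_maize = lod_in_subset(lines, "qtl2", "M", "qtl1")
print(round(lod_with_teosinte, 2), round(lod_with_maize, 2))
```

In this constructed example the minor QTL gives a clear LOD signal only in the subset fixed for the teosinte allele at the major QTL, which is the pattern the subset analysis is designed to expose.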
For 10kl1.1, two subsets of the data were analyzed: lines fixed for the maize allele of
10kl1.2, and lines fixed for the teosinte allele of 10kl1.2. When 10kl1.2 is fixed for the teosinte
allele, 10kl1.1 is still detected and in the same position (Table 3). When 10kl1.2 is fixed for the
maize allele, 10kl1.1 is not detected. This result was not unexpected because of the large
interaction term between 10kl1.1 and 10kl1.2. These results indicate that 10kl1.1 is real but only
has an effect on phenotype when the teosinte allele is present at 10kl1.2 due to an epistatic
interaction. Thus, 10kl1.2 is epistatic to 10kl1.1. This epistatic interaction is plainly visible when
the mean 10KL values for the different two-locus genotypic classes are compared (Figure 4A).
For 10kl1.2, two subsets of the data were analyzed: lines fixed for the maize allele at
10kl1.1, and lines fixed for the teosinte allele at 10kl1.1. When 10kl1.1 is fixed for the teosinte
allele, 10kl1.2 is still detected and in the same position (Table 3). When 10kl1.1 is fixed for the
maize allele, 10kl1.2 is still detected but it is shifted in position to fall over tb1. Thus, the
presence of 10kl1.2 is confirmed, and this analysis shows that the effects are independent of the
allelic composition at 10kl1.1. However, the conflicting results on its position indicate some
uncertainty about its exact location. One possibility is that 10kl1.2 is located at tb1.
For cupr1.1, two subsets of the data were analyzed: lines fixed for the maize allele of
cupr1.2, and lines fixed for the teosinte allele of cupr1.2. When cupr1.2 is fixed for the teosinte
allele, cupr1.1 is still detected but it is shifted in position to be over ed1.2 (Table 3). When
cupr1.2 is fixed for the maize allele, cupr1.1 is not detected. This result was not surprising
because an interaction term between cupr1.1 and cupr1.2 was nearly significant in the original
analysis (P-value of 0.0106 with a 0.01 cutoff). These results indicate that cupr1.1 is real but
only has an effect on phenotype when the teosinte allele is present at cupr1.2 because of an
epistatic interaction between these two QTL. This epistatic interaction is plainly visible when the
mean CUPR values for the different two-locus genotypic classes are compared (Figure 4B).
Table 3: Experiment II reanalysis: Refined QTL positions
QTL      Fixed Marker (a)  Refined Position (cM)  Original Position (cM)  LOD Score (b)  Refined CI (cM)  Original CI (cM)
10kl1.1  PZD00119-T        40.38                  40.25                   10.0           38.64 - 41.48    39.00 - 41.25
10kl1.1  PZD00119-M        -                      40.25                   -              -                39.00 - 41.25
10kl1.2  umc1298-T         46.23                  46.25                   9.2            45.73 - 46.93    45.75 - 46.75
10kl1.2  umc1298-M         45.37                  46.25                   11.0           42.24 - 51.63    45.75 - 46.75
cupr1.1  PZD00119-T        30.56                  29.00                   4.1            25.16 - 35.33    22.50 - 33.25
cupr1.1  PZD00119-M        -                      29.00                   -              -                22.50 - 33.25
cupr1.2  umc1914-T         48.99                  46.50                   12.8           45.79 - 51.78    45.25 - 49.25
cupr1.2  umc1914-M         46.18                  46.50                   18.9           44.43 - 49.20    45.25 - 49.25
(a) "M" indicates that the marker was fixed for the maize allele and "T" indicates that the marker
was fixed for the teosinte allele.
(b) LOD scores obtained with a subset of the full data in which one of the two QTL affecting the
trait was fixed for a single genotype.
Figure 4: Mean phenotypic values of four genotypic classes. The x-axis denotes the number of
lines representing each genotypic class (N) and the alleles (maize: M, teosinte: T) for the closest
marker to the QTL affecting the trait and for tb1, respectively. For A and C (etb1.2), the closest
marker is umc1298. For B and D (etb1.1), the closest marker is PZD00116. Eight of the 153 lines
had missing data for PZD00116 and were not included. Error bars represent the standard error for
each genotypic class. Traits are abbreviated as follows: 10-kernel length (10KL, in mm), cupules
per rank (CUPR), ear diameter (ED, in mm), staminate spikelets (STAM, percent).
[Figure 4 panels: A, 10KL for genotypes at 10kl1.1 (etb1.2) and 10kl1.2 (tb1); B, CUPR for
genotypes at cupr1.1 (etb1.1) and cupr1.2 (tb1); C, STAM for genotypes at stam1.1 (etb1.2) and
tb1; D, ED for genotypes at ed1.1 (etb1.1) and ed1.2 (tb1). The genotypic classes on each x-axis
are MM, TM, MT, and TT, with N = 96, 4, 9, and 44 in panels A and C and N = 80, 18, 15, and 32
in panels B and D.]
For cupr1.2, two subsets of the data were analyzed: lines fixed for the maize allele at
cupr1.1, and lines fixed for the teosinte allele at cupr1.1. When cupr1.1 is fixed for the teosinte
allele, cupr1.2 is still detected and it is located in the same position (Table 3). When cupr1.1 is
fixed for the maize allele, cupr1.2 is still detected but it is shifted in position to fall over tb1.
Thus, the presence of cupr1.2 is confirmed, and this analysis shows that the effects are
independent of the allelic composition at cupr1.1. However, the conflicting results on its position
indicate some uncertainty about its exact location. One possibility is that cupr1.2 is located at
tb1.
We also reassessed the position of stam1.1. The 1.5 LOD interval for this QTL includes
tb1; however, the maximum LOD is located near 10kl1.1. We evaluated whether a model
involving two linked QTL for STAM would best explain the data. Since tb1 was expected to
affect STAM based on the known effects of the tb1 mutant allele (Doebley et al. 1997), we
considered a model with one QTL at tb1 and a second QTL at the position of the LOD maximum
for stam1.1. The original analysis may have failed to define two separate QTL because of their
proximity to one another. We examined the mean values for STAM of the four genotypic
combinations of stam1.1 and tb1 (Figure 4C). From this figure, stam1.1 only has an effect on
phenotype when there is a teosinte allele at tb1. However, tb1 has a strong effect on phenotype
whether stam1.1 is fixed for the maize or the teosinte allele. The highest value for STAM is
obtained when there are teosinte alleles at both tb1 and stam1.1. These results suggest that there
are two QTL interacting to control STAM: stam1.1 with an effect that is dependent on the
teosinte allele at tb1; and tb1 with an effect regardless of the genotype of stam1.1. Thus, tb1 is
epistatic to stam1.1.
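The pattern described for STAM can be expressed as a simple interaction contrast on the four two-locus class means. The sketch below is an illustration, not the test used in the study: the function name is hypothetical and the class means are invented values chosen to mimic the described pattern (no effect of stam1.1 when tb1 is maize, a clear effect when tb1 is teosinte). Genotype labels follow the Figure 4 convention, with the first letter giving the allele at the linked QTL and the second the allele at tb1.

```python
def epistasis_contrast(mean_mm, mean_tm, mean_mt, mean_tt):
    """Effect of the first locus on each tb1 background, and their difference."""
    effect_on_maize_tb1 = mean_tm - mean_mm
    effect_on_teosinte_tb1 = mean_tt - mean_mt
    interaction = effect_on_teosinte_tb1 - effect_on_maize_tb1
    return effect_on_maize_tb1, effect_on_teosinte_tb1, interaction


# Hypothetical mean percent staminate spikelets for the MM, TM, MT and TT classes.
on_maize, on_teosinte, interaction = epistasis_contrast(
    mean_mm=1.0, mean_tm=1.0, mean_mt=6.0, mean_tt=11.0
)
print(on_maize, on_teosinte, interaction)
```

A contrast of zero would indicate purely additive action of the two loci; a nonzero contrast of the kind shown here is what the text describes as tb1 being epistatic to stam1.1.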
Discussion
Clark et al. (2006) studied how a ~54 cM teosinte chromosome segment encompassing the tb1
gene affected plant and ear architecture when it was introgressed into maize inbred W22. Their
analyses enabled them to map a factor controlling these phenotypes to a region between ~58-69
kb upstream of the tb1 open reading frame. Their experiments demonstrated that this “control
region” has strong effects on phenotypes, but they did not formally exclude the possibility that
there are other linked QTL in the introgressed teosinte chromosome segment.
A closer examination of the results by Clark et al. (2006) suggests that there may be other
QTL linked to tb1. Their results do indicate the tb1 control region explains all effects on plant
architecture. For example, their full ~54 cM (~59 Mbp) introgression segment (I01) has effects
on plant architecture that are indistinguishable from those of a partial <1 cM introgression
containing only ~69 kb surrounding the tb1 control region (I16). However, for traits related to
ear architecture, their results appear more complex. For example, their smaller introgression
(I16) appears to have a weaker effect on CUPR than their full ~54 cM introgression. In general,
their introgression lines containing larger segments of the teosinte genome appear to have
stronger effects on ear traits (CUPR and 10XCUP) than their introgression lines possessing
smaller introgressed segments that are shortened on either the proximal or distal side of the tb1
control region. These observations suggest that there are additional QTL linked to tb1 with
effects on ear traits. However, due to their experimental design, a direct comparison of
introgressed segments could not be made.
To determine if there are additional QTL for plant and ear architecture linked to tb1, we
performed an experiment (Experiment I) to test whether a minimal introgression of the tb1
control region (I16) was sufficient to produce the same phenotypes as a full introgression
segment (I01) that extended both proximal and distal to tb1. The results of Experiment I showed
no difference among the genotypic classes in the F2 population derived from the I01 × I16 cross
for plant architecture phenotypes (LBIL, TILL, and TILN). This result indicates that the control
region identified by Clark et al. (2006) is the only QTL for plant architecture located in the
introgressed chromosome segment. However, the three genotypic classes in the F2 population did
differ from one another for the ear morphology phenotypes (CUPR and ED). Thus, Experiment I
indicates that there are additional QTL linked to the tb1 control region that affect ear
morphology.
To map these additional QTL, a second experiment (Experiment II) was done using a set
of NIRILs. These lines contained recombinant chromosomes with cross-overs throughout the
region, giving us the ability to map QTL to relatively small intervals. The replicated design of
the experiment gave relatively high heritabilities for the traits, providing power to detect QTL
with modest effects. Because our study used lines that are isogenic except for the ~63 cM region
surrounding tb1, there were no QTL segregating in other regions of the genome that could
complicate our ability to detect QTL near tb1. The analysis of these NIRILs confirmed that tb1 is
a large effect QTL for seven of the eight traits analyzed (10KL, CUPR, ED, LBIL, STAM, TILL,
and TILN). In particular, our results indicate that tb1 is the only QTL for plant architecture traits
including: LBIL, TILL, and TILN. These results are consistent with previous studies (Doebley et
al. 1997; Clark et al. 2006). However, our analysis also detected several additional linked QTL
located proximal to tb1, two of which interact epistatically with tb1. These additional QTL only
affect ear traits (10KL, CUPR, ED, EL, and STAM).
Based on the results of Experiments I and II, we propose a model for the number and
positions of QTL in the introgressed segment (Figure 5). This model assumes that tb1 and its
neighboring QTL have pleiotropic effects on multiple traits. From prior work, it is known that
tb1 has pleiotropic effects on 10KL, CUPR, LBIL, STAM, TILL, and TILN (Doebley et al.
1997; Clark et al. 2006). Our data confirm these observations as we detected pleiotropic effects
of a QTL at tb1 on all these traits as well as ED. Thus, we hypothesize that tb1 is a QTL for
seven traits (Figure 5). Our analyses reveal another QTL 14 cM proximal to tb1 with effects on
CUPR and ED, and which interacts epistatically with tb1. We designate this QTL enhancer of
tb1.1 (etb1.1). Our analyses also reveal another QTL 6 cM proximal to tb1 with effects on
10KL and STAM, and which interacts epistatically with tb1. We designate this QTL enhancer of
tb1.2 (etb1.2). The epistatic interaction of tb1 with both etb1.1 and etb1.2 is plainly visible in
Figure 4. These results suggest the epistatic interaction between tb1 and etb1.2 (Figure 4A, C) is
stronger than the interaction between tb1 and etb1.1 (Figure 4B, D). Two additional QTL exist in
the introgressed segment: ed1.1 and el1.1. These two QTL each affect a single trait (ED and EL,
respectively) and neither shows an epistatic interaction with tb1. Together these five QTL
explain the maize-like versus teosinte-like phenotypes of the two parental lines (W22 and W22-
T1L).
Figure 5: Map of the five QTL in our working model. Arrows indicate estimated positions of
each QTL. Traits listed correspond to the phenotypes that map to each QTL. Flanking markers
are included for reference. Traits are abbreviated as follows: 10-kernel length (10KL, in mm),
cupules per rank (CUPR), ear diameter (ED, in cm), ear length (EL, in cm), lateral branch
internode length (LBIL, in cm), staminate spikelets (STAM, percent), tillering (TILL), and tiller
number (TILN).
The data from Clark et al. (2006) also suggest the presence of additional QTL affecting
CUPR downstream of tb1. For example, introgression lines containing the full teosinte
introgressed segment (I01) have a strong effect on CUPR, while lines with the maize allele
downstream of tb1 have a weaker effect. An additional QTL downstream of tb1 contributing to
CUPR may be the reason that cupr1.2 was not located directly over tb1 in our experiment, but
instead peaked distal to tb1 (Figure 3). It is possible that our analysis did not identify a distal
QTL because of its proximity to tb1, and/or its effect size on CUPR.
Two QTL (etb1.1 and etb1.2) identified in our experiments interact epistatically with tb1
(Figure 4). Such epistatic interactions are generally difficult to detect in QTL mapping studies
(Mackay et al. 2009), and thus the amount of epistasis detected in QTL mapping experiments
varies from study to study (Flint and Mackay 2009). A QTL mapping experiment for flowering
time in maize demonstrated that epistasis has a negligible effect on this trait, while other
examples in the literature from Arabidopsis, flies, mice, and rice show large epistatic effects for
various traits (Buckler et al. 2009; Flint and Mackay 2009).
There are at least two reasons that we were able to detect epistatic QTL. First, our
experiments focused on a relatively small genomic region. Thus, we did not suffer the loss of
statistical power that comes along with performing a large number of pair-wise tests of epistasis
as occurs with whole-genome scans for epistasis (Holland 2007). Second, the epistatic
interactions detected in our analyses have relatively large effect sizes so that relatively little
statistical power is needed to reject a false null hypothesis (Table 2). It may also be important
that maize and teosinte diverged 10,000 generations ago and maintain separate gene pools and
evolutionary trajectories. Thus, over time, maize and teosinte may have been selected for specific
combinations of alleles at multiple loci, one combination adapted to natural conditions and the
other to agricultural circumstances.
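The first point above, the cost of many pair-wise tests, can be illustrated with a short calculation. This is a minimal sketch only: the marker counts are hypothetical, and the Bonferroni correction is used here as a simple stand-in for whatever multiple-testing adjustment a given study applies; it is not taken from the analyses in this chapter.

```python
from math import comb


def pairwise_tests(n_markers: int) -> int:
    """Number of distinct marker pairs that a two-locus epistasis scan would test."""
    return comb(n_markers, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that holds the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical marker counts, chosen only for illustration.
region_markers = 20     # markers inside a single introgressed segment
genome_markers = 1000   # markers in a whole-genome scan

for label, n in (("region", region_markers), ("genome", genome_markers)):
    tests = pairwise_tests(n)
    print(label, tests, bonferroni_threshold(0.05, tests))
```

Because the number of pairs grows roughly with the square of the number of markers, restricting the scan to one region keeps the per-test threshold orders of magnitude less stringent than in a genome-wide scan.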
Both Experiments I and II support the hypothesis that there are additional QTL linked to a
major domestication locus (tb1). We detect these additional QTL in our teosinte × W22 mapping
populations. It is unknown whether these QTL were involved in maize domestication or simply
differentiate the maize inbred W22 and our specific teosinte parents. We do not know if these
QTL would have been detected had we used a different modern maize inbred or even a primitive
maize variety. To address this possibility, we are currently attempting to clone etb1.2. Once the
gene underlying etb1.2 has been identified, we will have a critical tool for investigating its
potential role in maize domestication and its interaction with tb1.
References
Broman, K. W., H. Wu, S. Sen and G. A. Churchill, 2003 R/qtl: QTL mapping in experimental
crosses. Bioinformatics 19: 889-890.
Buckler, E. S., J. B. Holland, P. J. Bradbury, C. Acharya, P. J. Brown et al., 2009 The genetic
architecture of maize flowering time. Science 325: 714–718.
Clark, R., T. Nussbaum-Wagler, P. Quijada and J. Doebley, 2006 A distant upstream enhancer at
the maize domestication gene, tb1, has pleiotropic effects on plant and inflorescence
architecture. Nature Genetics 38: 594-597.
Doebley, J. and A. Stec, 1991 Genetic analysis of the morphological differences between maize
and teosinte. Genetics 129: 285-295.
Doebley, J. and A. Stec, 1993 Inheritance of the morphological differences between maize and
teosinte: comparison of results for two F2 populations. Genetics 134: 559-570.
Doebley, J., A. Stec and C. Gustus, 1995 teosinte branched1 and the origin of maize: evidence
for epistasis and the evolution of dominance. Genetics 141: 333-346.
Doebley, J., A. Stec and L. Hubbard, 1997 The evolution of apical dominance in maize. Nature
386: 485-488.
Doerge, R. W. and G. A. Churchill, 1996 Permutation tests for multiple loci affecting a
quantitative character. Genetics 142: 285-294.
Flint, J. and T. F. C. Mackay, 2009 Genetic architectures of quantitative traits in flies, mice and
human. Genome Research 19: 723–733.
Haley, C. S. and S. A. Knott, 1992 A simple regression method for mapping quantitative trait
loci in line crosses using flanking markers. Heredity 69: 315–324.
Holland, J. B., 2007 Genetic architecture of complex traits in plants. Current Opinion in Plant
Biology 10: 156-161.
Lander, E. S. and D. Botstein, 1989 Mapping Mendelian factors underlying quantitative traits
using RFLP linkage maps. Genetics 121: 185-199.
Littell, R. C., G. A. Milliken, W. W. Stroup and R. D. Wolfinger, 1996 SAS system for mixed
models. SAS Institute, Cary, NC.
Mackay, T. F. C., E. A. Stone and J. F. Ayroles, 2009 The genetics of quantitative traits:
challenges and prospects. Nature Reviews Genetics 10: 565-577.
Noor, M. A. F., A. L. Cunningham and J. C. Larkin, 2001 Consequences of recombination rate
variation on quantitative trait locus mapping studies: simulations based on the Drosophila
melanogaster genome. Genetics 159: 581–588.
Table S1: RFLP Markers used during backcrossing of T1L in Experiment II
Marker Chromosome Marker Chromosome
bnl5.62 1 umc2a 3
umc157 1 php20725 4
umc37b 1 umc19 4
npi255 1 umc127a 4
BZ2 1 bnl10.17b 4
bnl8.10 1 umc15 4
npi615 1 bnl8.23 4
umc107 1 bnl8.33 5
npi225 1 bnl6.25 5
bnl8.45 2 umc90 5
umc53 2 umc27 5
npi320 2 umc166 5
npi421 2 bnl7.71 5
umc6 2 npi412 5
umc34 2 umc54 5
umc134 2 umc127b 5
umc131 2 umc104a 5
umc2b 2 bnl6.29 6
umc5a 2 umc65 6
php20005 2 umc21 6
umc122 2 umc46 6
umc49a 2 umc132 6
umc36 2 umc62 6
umc32 3 npi114 8
umc121 3 bnl9.11 8
php20042 3 umc117 8
umc42b 3 umc7 8
umc161 3 npi253 9
umc18 3 umc113 9
TE1 3 umc81 9
bnl5.37 3 umc95 9
bnl8.01 3 bnl3.04 10
umc60 3 umc130 10
bnl12.97 3 umc49b 10
php10080 3 umc117b 10
npi425 3 bnl7.49 10
Chapter 3
Evidence for a natural allelic series at the maize domestication gene
teosinte branched1
Abstract
Despite numerous quantitative trait loci and association mapping studies, our understanding of
the extent to which natural allelic series contribute to the variation of complex traits is limited. In
this study, we investigate the occurrence of a natural allelic series for complex traits at the
teosinte branched1 (tb1) gene in natural populations of teosinte (Zea mays ssp. parviglumis, Z.
mays ssp. mexicana, and Z. diploperennis). Previously, tb1 was shown to confer large effects on
both plant architecture and ear morphology traits. tb1 has been studied extensively as a key gene
involved in the domestication of maize from teosinte; however, the effect of tb1 on trait variation
in natural populations of teosintes has not been investigated. We compare the effects of nine
teosinte alleles of tb1 that were introgressed into an isogenic maize inbred background. Our
results provide evidence for a natural allelic series at tb1 for several complex morphological
traits. The teosinte introgressions separate into three distinct phenotypic classes, which
correspond to the taxonomic origin of the alleles. The effects of the three allelic classes also
correspond to known morphological differences between the teosinte taxa. Our results suggest
that tb1 contributed to the morphological diversification of teosinte taxa as well as to the
domestication of maize.
Introduction
Over the past several decades, there has been considerable interest in the genetic architecture of
trait variation in natural populations as defined by the number of genes involved and the effect sizes
of these genes (Tanksley 1993; Mackay 2001). A key component of this issue is how variation is
structured at individual genes. Are genes typically biallelic, like Mendel’s classic loci, or do
genes often harbor “allelic series,” i.e. multiple alleles with measurably different effects on
traits? While allelic series are known for pigmentation and other simple phenotypic traits, such
as the extension allelic series controlling coat color in rabbits (Fontanesi et al. 2006), allelic
series for complex morphological traits are not well-documented. The unambiguous
documentation of a natural allelic series for complex traits would further the understanding of the
genetic architecture of variation in natural populations.
Maize and its wild relatives, the teosintes, are an attractive system for the study of natural
variation and complex traits. Maize and the teosintes belong to the genus Zea which has four
species that are native to Mexico and Central America: Z. perennis, Z. luxurians, Z.
diploperennis and Z. mays (Doebley and Iltis 1980). The latter species includes four subspecies:
one for domesticated maize (ssp. mays) plus three subspecies for teosinte (sspp. parviglumis,
mexicana and huehuetenangensis), each with a distinct eco-geographic distribution. Of these
three wild subspecies, ssp. parviglumis has been identified as the wild progenitor of maize
(Doebley 2004). Since these teosinte taxa are interfertile with maize, one can leverage the
genetic tools of maize to study variation in teosinte. Some of these teosinte taxa are widespread
and contain abundant natural genetic variation. The teosintes are an appealing gene pool in
which one could search for natural allelic series for complex traits.
Among the ~35,000 maize genes, an attractive candidate for the study of natural allelic
series is teosinte branched1 (tb1). This gene controls plant architecture (apical dominance) and
ear morphology (Doebley et al. 1997). tb1 is a member of the TCP family of transcription factors
(Cubas et al. 1999), and it is one of the key genes involved in the domestication of maize
(Doebley 2004). During maize domestication, ancient farmers selected an allele of tb1 that is
expressed about twice as strongly as most teosinte alleles. The factor controlling this difference
in gene expression has been mapped to a regulatory region 58 to 69 kb upstream of the tb1 ORF
(Clark et al. 2006). Since teosinte possessed natural allelic variation at tb1 upon which ancient
farmers could apply selection, it seems plausible that teosinte might contain a natural allelic
series at this gene for traits related to plant architecture and ear morphology.
In this paper, we present evidence for a natural allelic series at tb1. We introgressed nine
teosinte chromosomal segments encompassing tb1 into the isogenic background of a maize
inbred line. These tb1 alleles included four from Z. mays ssp. mexicana, four from Z. mays ssp.
parviglumis, and one from Z. diploperennis. We compare the effects of these introgressions to one
another and to a maize reference allele for four morphological traits previously shown to be
controlled by this gene (Clark et al. 2006). We show that the teosinte introgressions separate into
three distinct phenotypic classes and that these classes correspond to the taxonomic origin of the
alleles. Moreover, the effects of the alleles match the known morphological differences between
these taxa. Our results suggest that tb1, which contributed to maize domestication, also played a
role in the morphological divergence of teosinte taxa.
Materials and Methods
Plant materials
Segments of the long arm of chromosome 1 from nine different teosintes (IS1-9, Table S1) were
introgressed into a maize inbred W22 background via six generations of backcrossing. During
the backcrossing process, RFLP markers (NPI615, umc140, tb1, umc107, bnl15.18, kn1)
flanking tb1 were used to follow the target segment. After the sixth generation of backcrossing, the
BC6 plants were selfed and PCR-based markers were used to map each of the teosinte
introgressed chromosomal segments (Figure 1, Table S2). A population of 120 BC6S1 plants
from each of the nine introgression stocks was grown and genotyped using two PCR-based indel
markers (Table S2). We then selected ~25 plants of each homozygous genotypic class
(homozygous maize, homozygous teosinte) from each of the nine populations for phenotypic
analysis.
Phenotypic data collection and analysis
Plants were grown at the University of Wisconsin West Madison Agricultural Research Station,
Madison, WI, USA during the summer of 2009. BC6S1 plants segregating for the introgressed
teosinte chromosomal segments were planted in a randomized grid with 0.9 meter spacing
between plants in both dimensions. This spacing minimized the degree to which plants shaded
their neighbors. Using BC6S1 plants allowed us to compare individuals containing the
introgressed teosinte chromosomal segments with individuals homozygous for the W22 segment.
Seed for each of the nine populations was obtained from a single ear, thus eliminating any
concern that differences among genotypic classes within a population are due to ear-parent
effects.
Figure 1: Physical map of the introgression lines. All introgressed segments are drawn to scale,
and vertical dotted lines show AGPv2 reference position (Mb). Shaded areas indicate teosinte
chromosome segments based on taxonomic origin: (blue) Zea diploperennis, (red) Z. mays ssp.
parviglumis, and (green) Z. mays ssp. mexicana; unshaded areas represent maize chromosome
segments. Markers used for genotyping are shown along the chromosomes as solid black lines
and listed in Table S2. The position of tb1 is shown for reference.
[Figure 1 labels: IS1, IS9; Chromosome 1L.]
The following four traits were phenotyped: cupules per rank (CUPR; number of cupules
in a single rank from the base to the tip of the ear), lateral branch internode length (LBIL; mean
internode length, in cm, of the uppermost lateral branch), tillering (TILL; the ratio of the sum of
tiller heights to plant height), and percent staminate spikelets (STAM; percentage of male spikelets
in the inflorescence). CUPR and STAM were both measured on the uppermost, well-formed
lateral inflorescence (ear) of each plant.
The data were analyzed using the software package JMP IN 4.0.4, for calculating means,
standard errors, ANOVAs, Levene's tests, and principal component analysis (PCA). The PCA
was calculated based on the trait correlation matrix.
Nucleotide sequence analysis
Polymerase chain reaction (PCR) was done using Qiagen Taq DNA Polymerase following the
manufacturer's instructions and standard methods. One primer set was used to amplify the coding
region (GGACATATGAGTAGGCCACACTCCTCC, GATTTGCAGCTCATCAAGAAA) and
two additional internal primers were used to sequence the PCR product
(TCATGGACAACGATGAGTGG, CCAAGAAAATCGGCCAATAA). Two primer sets were
used to amplify the control region (CGGTCAAAGAGTAGGGCAAG,
GCGTCTGTTCCGCATTCA and ACTCAACGGCAGCAGCTACCTA,
CGTGTGTGTGATCGAATGGT). Sequencing of PCR fragments was done using Applied
Biosystems (ABI) BigDye and an ABI 3730xl DNA Analyzer at the University of Wisconsin
Biotechnology Center DNA Sequencing Facility. Initial alignment of nucleotide sequences was
performed using ClustalW (Thompson et al. 1994) and then finished by hand. Neighbor Joining
Trees were constructed in PAUP 4.0b10 (Swofford 2003) using the absolute number of
differences after gaps and missing or ambiguous bases were removed from the alignment.
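The distance measure described above (absolute number of differences after removing gaps and missing or ambiguous bases) can be sketched as follows. This is an illustration only, not the PAUP implementation: the sequence names and bases are hypothetical, and it assumes that a column is dropped whenever any sequence has a non-ACGT character at that position.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def clean_columns(alignment):
    """Keep only alignment columns in which every sequence has an unambiguous base."""
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    kept = [col for col in zip(*seqs) if all(base in VALID_BASES for base in col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


def pairwise_differences(alignment):
    """Absolute number of differing sites for every pair of sequences."""
    cleaned = clean_columns(alignment)
    return {
        (a, b): sum(x != y for x, y in zip(cleaned[a], cleaned[b]))
        for a, b in combinations(cleaned, 2)
    }


# Hypothetical toy alignment; "-" marks a gap and "N" an ambiguous base.
toy_alignment = {
    "seq_maize": "ACGTAC-TGA",
    "seq_teo_a": "ACGTTCGTNA",
    "seq_teo_b": "ACCTTCGTGA",
}

for pair, n_diff in pairwise_differences(toy_alignment).items():
    print(pair, n_diff)
```

A matrix of such counts is the kind of input a neighbor-joining routine takes; the tree-building step itself is left to dedicated software.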
Results
Allele means and variances: To test each of the tb1 teosinte chromosomal segments for an
effect on phenotype relative to the control W22 chromosome segment, the mean of each
genotypic class for each of the nine introgression lines was plotted for the four phenotypes
(Figure 2): the number of cupules per rank (CUPR), the internode length on the uppermost lateral
branch (LBIL), the amount of tillering (TILL), and the percent staminate spikelets on the primary
lateral inflorescence (STAM). These traits represent ear morphology (CUPR and STAM) and
plant architecture (LBIL and TILL), which are some of the major morphological differences
between maize and teosinte. One of the teosinte introgressed segments (IS6) does not have data
for ear morphology traits because all of the ears (from introgression and control plants) were
sterile (without kernels).
Our initial question was: do the introgressed tb1 teosinte chromosomal segments confer a
different phenotype than the maize (W22) control segment? For cupules per rank (CUPR), an
ANOVA indicates that plants containing a teosinte chromosomal segment have a significant
decrease in the number of cupules compared to plants with a W22 segment (F= 14.43; P-value =
0.0058; Figure 2). This is the expected result, given that teosinte has fewer kernels per ear than
maize. The average additive effect of a teosinte introgressed segment (IS) is -1.8 cupules per
rank with a range from -0.2 for IS4, to -3.4 for IS3 (Figure 3).
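For readers who want to see how an additive effect of this kind can be computed, here is a minimal sketch. It assumes the standard quantitative-genetic definition, half the difference between the two homozygous class means; the text does not spell out the formula, so this is an assumption. The plant values are hypothetical and are not data from this study.

```python
from statistics import mean


def additive_effect(teosinte_values, maize_values):
    """Half the difference between homozygous teosinte and homozygous maize class means."""
    return (mean(teosinte_values) - mean(maize_values)) / 2.0


# Hypothetical cupules-per-rank counts for one BC6S1 population.
teosinte_class = [29.0, 29.0, 30.0, 28.0, 29.0]  # homozygous for the teosinte segment
maize_class = [30.0, 29.0, 30.0, 29.0, 30.0]     # matched plants homozygous for W22

effect = additive_effect(teosinte_class, maize_class)
print(f"additive effect on CUPR: {effect:.2f}")
```

Under this definition a negative value means the teosinte segment lowers the trait relative to the matched W22 control from the same population.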
Figure 2: Phenotypic means. Points are shaded based on taxonomic origin of the tb1
introgressed segment: (purple) Zea mays ssp. mays control lines, (blue) Z. diploperennis, (red) Z.
mays ssp. parviglumis, and (green) Z. mays ssp. mexicana. Error bars represent the standard error
for each genotypic class. The x-axis shows the introgression segments; the y-axis shows trait
means.
Figure 3: Additive effects. Traits are abbreviated as follows: cupules per rank (CUPR), lateral
branch internode length (LBIL, in cm), staminate spikelets (STAM, percent), and tillering
(TILL). The x-axis shows the introgression segments; the y-axis shows additive effects. Error
bars represent the standard deviation for each genotypic class. Bars are shaded based on
taxonomic origin: (blue) Zea diploperennis, (red) Z. mays ssp. parviglumis, and (green) Z. mays
ssp. mexicana.
[Figure 3 panels: CUPR, STAM, LBIL, TILL. The x-axis of each panel lists the IS lines in the order plotted: CUPR: 4, 2, 5, 9, 7, 8, 1, 3; STAM: 4, 5, 7, 9, 2, 3, 8, 1; LBIL: 5, 3, 9, 6, 2, 1, 4, 8, 7; TILL: 6, 5, 8, 4, 2, 9, 3, 7, 1.]
The second question we asked was: do teosinte segments confer a different number of
cupules per rank relative to one another? An ANOVA indicates significant differences among
the teosinte introgressions (F= 32.42; P-value <0.0001; Figure 2). For example, IS1 has a mean
CUPR of 23.0, while IS4 has a mean CUPR of 29.1. In contrast, two teosinte introgressions (IS2
and IS4) have additive effects statistically equivalent to 0.0, indicating that some teosinte
segments have equivalent effects to the W22 segment. These results suggest that different
introgressed tb1 teosinte chromosomal segments possess distinct genetic factors which confer
different numbers of cupules per rank.
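The F statistics quoted here come from ANOVAs run in JMP; the exact model specification is not given in the text. As a rough illustration of what such a test computes, the sketch below implements a plain one-way fixed-effects ANOVA, which is an assumption about the model. The group values are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way fixed-effects ANOVA."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual plants around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical cupules-per-rank counts for plants from three teosinte introgressions.
cupr_by_introgression = [
    [23.0, 24.0, 22.0, 23.0],
    [29.0, 30.0, 28.0, 29.0],
    [26.0, 27.0, 25.0, 27.0],
]

f_stat, df_b, df_w = one_way_anova_f(cupr_by_introgression)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Converting F to a P-value requires the F distribution, which a statistics package provides; that step is omitted here.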
Finally, we asked: do the control plants for different introgressions (i.e. plants
homozygous maize W22 for the segment) confer different numbers of cupules relative to one
another? The expectation is that each of the recovered control lines will be phenotypically
equivalent since they all carry the W22 segment, even though they are products of different
backcrossing populations. Consistent with this expectation, there are no significant differences
among the W22 control lines (F= 1.51; P-value = 0.17). On average, the W22 control lines have
29.8 CUPR with a range of 29.4 for IS7 to 30.5 for IS8. Because there are significant differences
among teosinte introgressions but not among control lines, these results argue that there are
“allelic” differences among teosinte introgressions for CUPR.
A different pattern of variation was observed for the proportion of staminate spikelets
(STAM) in the primary lateral inflorescence (Figure 2). STAM is considered a domestication
trait because a teosinte plant has a primary lateral branch that is tipped in a tassel whereas in
maize this branch is tipped by an ear. The majority of teosinte introgressions and all of the
recovered W22 control lines produce ears with less than 1% staminate spikelets. However, there
are three teosinte introgressions that show an additive effect of greater than 1% staminate
spikelets with IS1 having the highest additive effect of 3% (Figure 3). Despite these observed
differences, an ANOVA narrowly failed to support a difference between the teosinte
introgressions and the W22 control lines (F= 4.84; P-value = 0.0637) for STAM. However, an
ANOVA that compares teosinte introgressions with >1% STAM to their matched W22 control
lines showed a significant difference between these groups (F= 117.523; P-value = 0.0004). A
significant difference among teosinte introgressions was also observed (F= 7.13; P-value
<0.0001), and there is no significant variation among the W22 control lines (F= 1.01; P-value =
0.43). Overall, these results reveal significant differences among the lines with teosinte
chromosomal segments but not among control lines, and thus indicate that there are “allelic”
differences among the teosinte introgressions for STAM.
For tillering, plants with a tb1 teosinte chromosomal segment show greater tillering than
the corresponding control plants carrying the W22 segment (Figure 2). This increase in tillering
is consistent with the known difference in tillering between maize and teosinte. An ANOVA
indicates that plants containing a teosinte chromosomal segment have an increase in the amount
of tillering relative to plants with a W22 segment (F= 70.87; P-value <0.0001). The average
additive effect of a teosinte introgressed segment is an increase in tillering by 0.6 with a range of
0.4 for IS6, to 0.7 for IS1. An ANOVA indicates significant differences among the teosinte
introgressions (F= 6.22; P-value <0.0001). For example, IS6 has a mean tillering score of 1.8
while IS3 has a mean tillering score of 2.8 (Figure 2). These results suggest that the different
introgressed tb1 teosinte chromosomal segments possess distinct genetic factors which confer
different amounts of tillering. However, tillering is different from the previous two traits
examined because contrary to expectation, there is considerable heterogeneity among the W22
control lines (F= 5.77; P-value <0.0001). On average, the W22 control lines have a tillering
score of 1.2 with a range of 0.7 for IS7, to 1.5 for IS3 and IS8. These results suggest that there
are factors other than tb1 affecting phenotype. Therefore, we cannot conclude that the observed
differences in tillering among the teosinte introgressions are caused by tb1 alone.
Since significant heterogeneity for tillering was observed among the control lines, two
possible explanations were considered for this unexpected result. First, significant phenotypic
differences between control lines could be observed if additional genetic factors segregated
between backcrossing populations at other loci in the genome unlinked to the target segment
encompassing tb1. Such factors would increase (or decrease) the trait mean for both plants with
the introgressed teosinte chromosomal segment and the corresponding control plants. Therefore,
it is important to only compare the phenotypes of plants containing teosinte segments to W22
control lines that originate from the same backcrossing population. Another explanation is that
environmentally determined seed quality differs among the ear-parents for different
introgressions. This is particularly possible since only a single ear parent was used for each
introgression. Ear-parent effects such as seed weight, seed maturity and speed of germination can
influence adult phenotype. Thus, environmentally induced ear-parent effects could account for
the differences seen among the control lines derived from different ears. This result highlights
the importance of only comparing the phenotypes of plants containing a particular teosinte
segment to the W22 control plants derived from the same ear.
Given the heterogeneity among W22 control lines for tillering, we asked: is the variance
among teosinte introgressions greater than among the control lines? This is the expectation if
there are multiple functionally different teosinte alleles at tb1 affecting tillering. To answer this
question we compared the variance among the teosinte introgressions to that among the control
lines for TILL using Levene's test. This test detected no difference in variance between teosinte
introgressions and control lines (F= 0.0089; P-value = 0.9262). This result does not support an
allelic series at tb1 for tillering because the variances among the different teosinte introgressions
and the control lines are equivalent.
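The comparison of variances described above can be sketched with the mean-centered form of Levene's statistic, which is a one-way ANOVA F computed on absolute deviations from each group mean. Whether JMP's implementation matches this form exactly is an assumption, and the line means below are hypothetical.

```python
from statistics import mean


def levene_statistic(groups):
    """Levene's W: one-way ANOVA F on absolute deviations from each group's mean."""
    deviations = []
    for g in groups:
        center = mean(g)
        deviations.append([abs(x - center) for x in g])
    k = len(deviations)
    n_total = sum(len(z) for z in deviations)
    z_means = [mean(z) for z in deviations]
    z_grand = mean([v for z in deviations for v in z])
    between = sum(len(z) * (m - z_grand) ** 2 for z, m in zip(deviations, z_means))
    within = sum((v - m) ** 2 for z, m in zip(deviations, z_means) for v in z)
    return ((n_total - k) / (k - 1)) * between / within


# Hypothetical tillering means: one value per introgression line in each group.
teosinte_line_means = [2.1, 2.4, 2.8, 2.2, 2.0, 1.8, 2.5, 2.6, 2.3]
control_line_means = [1.0, 1.3, 1.5, 1.1, 0.9, 1.2, 0.7, 1.5, 1.4]

w = levene_statistic([teosinte_line_means, control_line_means])
print(f"Levene's W = {w:.3f}")
```

A W near zero means the two groups have similar spread, which is the pattern reported for TILL; a large W would point to greater variance in one group.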
For lateral branch internode length (LBIL), the teosinte introgressions have longer
branches than the maize control lines with one exception – IS5 has branches equivalent in length
to its matched control line (Figure 2). An ANOVA indicates that overall plants containing a
teosinte tb1 chromosomal segment have longer lateral branch internodes than plants with a W22
segment (F= 15.7039; P-value = 0.0026). The average additive effect of a teosinte introgression
segment on internode length is 0.7 cm with a range of -0.1 cm for IS5, to 1.5 cm for IS7. Most
teosinte introgressions confer longer internodes resulting in a longer lateral branch as seen in
teosinte as compared to maize. However, as mentioned above, one introgression (IS5) has an
additive effect that is statistically equivalent to 0.0. These results suggest that the severity of the
phenotype may vary depending on the specific teosinte chromosomal segment at tb1. An
ANOVA indicates that there are significant differences among lines containing different teosinte
tb1 segments for lateral branch internode length (F= 21.01; P-value <0.0001). However, there is
also a significant difference among W22 control lines (F= 8.04; P-value <0.0001). Thus,
differences among the teosinte introgression lines could be solely the result of ear-parent effects
as discussed above.
Since there were significant differences among W22 control lines for LBIL, we used
Levene's test to ask whether the variance among teosinte introgressions for LBIL is equivalent to
the variance among the control lines as expected if there is not an allelic series at tb1. This test
indicates that there is greater variance among teosinte introgressions as compared to the control
lines (F= 5.084; P-value = 0.0385), suggesting that the teosinte introgressions may possess
different allelic effects for LBIL. A graph of the additive effects for internode length highlights
the small effect of IS5, and the large effect IS7 has on internode length compared to the rest of
the teosinte introgressions (Figure 3).
Allelic Classes: To assess whether the different tb1 teosinte introgressions could be classified
into groups based on phenotype, a principal component analysis was performed using the four
traits as input data. IS6 was not included in the analysis because it lacks data for two of the four
traits. Two components were retained from the analysis, which explain 64% and 27% of the
observed variance, respectively. The ear morphology traits, cupules per rank and staminate
spikelets, load to component 1, which is represented by the x-axis in Figure 4. The plant
architecture traits, tillering and lateral branch internode length, load to component 2, which is
represented by the y-axis in Figure 4. The W22 control point plots to the lower left quadrant of
the graph with distance from this point corresponding to stronger teosinte-like phenotypes
(Figure 4).
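A principal component analysis on the trait correlation matrix, as used above, can be sketched in a few lines. The analysis in this chapter was run in JMP; the power iteration with deflation below is a swapped-in technique chosen only because it needs no external library, and the trait values are hypothetical.

```python
from math import sqrt
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length trait columns (none may be constant)."""
    n = len(columns[0])
    z = []
    for col in columns:
        m = mean(col)
        s = sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        z.append([(x - m) / s for x in col])
    p = len(columns)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]  # non-uniform start to avoid a degenerate case
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v


def deflate(matrix, eigenvalue, vector):
    """Remove an extracted component so the next power iteration finds the following one."""
    p = len(matrix)
    return [[matrix[i][j] - eigenvalue * vector[i] * vector[j] for j in range(p)]
            for i in range(p)]


# Hypothetical per-line trait values (one entry per line).
traits = {
    "CUPR": [1.0, 4.0, 2.0, 5.0, 3.0, 6.0, 2.5, 4.5],
    "STAM": [5.0, 2.0, 4.5, 1.0, 3.0, 0.5, 4.0, 1.5],
    "LBIL": [0.2, 0.9, 0.4, 0.3, 1.1, 0.8, 0.1, 0.7],
    "TILL": [0.3, 0.8, 0.5, 0.2, 1.0, 0.9, 0.2, 0.6],
}

corr = correlation_matrix(list(traits.values()))
eig1, vec1 = leading_component(corr)
eig2, vec2 = leading_component(deflate(corr, eig1, vec1))

# The trace of a correlation matrix equals the number of traits, so each
# eigenvalue divided by that number is the proportion of variance explained.
n_traits = len(traits)
print(f"component 1: {eig1 / n_traits:.0%}, component 2: {eig2 / n_traits:.0%}")
```

The entries of each eigenvector are the trait loadings, which is how one would see that ear traits load mainly on one component and plant architecture traits on the other.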
The principal component analysis suggests that there are three classes of teosinte
introgressions. The first class is composed of a single introgression (IS5) which plots away from
the rest of the teosinte introgressions and is located in the quadrant containing the W22 control
point. This result suggests that IS5 is a weak "allele" and confers a phenotype that is only
modestly different from the W22 control. This relationship can also be observed by looking at
IS5 for each trait individually (Figures 2-3). The second class is composed of IS2, 4, 7, and 9, all
of which plot to the upper left quadrant (Figure 4). This quadrant represents teosinte
introgressions that have relatively strong effects on plant architecture traits (tillering and lateral
branch internode length), but little effect on ear morphology traits (cupules per rank and
staminate spikelets). The final class is composed of IS1, 3, and 8 and occupies the right half of
the graph along the x-axis (Figure 4). These teosinte introgressions have strong effects on ear
morphology traits as well as on plant architecture traits. In particular, IS1, 3, and 8 have a high
percentage of male spikelets (Figures 2-3).
Figure 4: Principal components plot. The x-axis shows component 1, which represents ear
morphology traits; the y-axis shows component 2, which represents plant morphology traits. Dots
are shaded based on taxonomic origin: (purple) Zea mays ssp. mays, (blue) Z. diploperennis,
(red) Z. mays ssp. parviglumis, and (green) Z. mays ssp. mexicana.
[Figure 4 point labels: IS1 (PAR), IS2 (MEX), IS3 (PAR), IS4 (MEX), IS5 (DIP), IS7 (MEX), IS8 (PAR), IS9 (MEX), W22 (MAY); x-axis: Component 1.]
Strikingly, the PCA reveals that the allelic classes correspond to the taxonomic origin of
each teosinte introgression (Figure 4). The allelic class with the strongest teosinte phenotype
corresponds to introgressions from Z. mays ssp. parviglumis (PAR). The allelic class with
moderate teosinte phenotypes corresponds to introgressions from the Z. mays ssp. mexicana
(MEX). Finally, the allelic class with the weakest teosinte phenotypes corresponds to the
introgression from the species Z. diploperennis (DIP). Thus, the allelic series at tb1 appears to
have a taxonomic basis. Because of the isogenic nature of the introgression lines, the allelic
series is not the result of population structure, but rather a difference at or near tb1.
Although the allelic series shows a distinct taxonomic signature, we also asked whether
the allelic classes were correlated with the length of the introgressed segments (Figure 1). No
obvious correlation between phenotype and introgression length is observed. For example, the
largest introgression (IS2) does not have the strongest effect on phenotype, nor does the smallest
introgression (IS9) have the weakest effect on phenotype (Figure 3). Moreover, different
introgression lengths are represented in the allelic classes defined in the PCA. This result
supports the conclusion of an allelic series at tb1, as opposed to other linked genes in the
introgressed segments causing the observed allelic differences.
To explore the possibility of a correlation between the nucleotide sequence variation in
tb1 and phenotype, we plotted the phenotypic classes defined by the PCA onto neighbor-joining
trees based upon two portions of the tb1 sequence (Figure 5). One portion is the protein coding
region of the gene and 3’ UTR, and the other corresponds to a known upstream regulatory region
of tb1 (Clark et al. 2006). The teosinte introgressions representing any single class defined by the
PCA are scattered across both of the phylogenetic trees, and for the most part, no relationship
between phylogeny and phenotype is apparent. For example, the introgressions in the class with
strong ear trait phenotypes (IS1, 3, and 8) do not cluster in either phylogeny. One striking
feature of both phylogenies is that IS5, which was derived from a separate species
(Z. diploperennis, DIP) and has unique phenotypic effects, stands apart from all other
introgressions in both trees. This result suggests that the different phenotypes observed for IS5
when compared to the other introgressions could be due to sequence differences in the upstream
control region and/or the coding sequence of tb1. The introgressions sampled from Z. mays ssp.
parviglumis (PAR) did not cluster together on either of the phylogenetic trees, nor did those
sampled from Z. mays ssp. mexicana (MEX).
Figure 5: Phylogenetic trees. (A) A neighbor joining tree based on sequence from the tb1 coding
region. (B) A neighbor joining tree based on sequence from the tb1 upstream control region.
Text color is based on taxonomic origin: (purple) Zea mays ssp. mays, (blue) Z. diploperennis,
(red) Z. mays ssp. parviglumis, and (green) Z. mays ssp. mexicana.
Panel titles: (A) Coding Region, (B) Control Region. Tip labels: B73 (MAY), W22 (MAY), IS1 (PAR),
IS2 (MEX), IS3 (PAR), IS4 (MEX), IS5 (DIP), IS6 (PAR), IS7 (MEX), IS8 (PAR), IS9 (MEX). Scale bar:
1 change.
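Whether members of a phenotypic class cluster by sequence can also be checked numerically, by comparing pairwise distances within and between classes. The sketch below is illustrative only. It uses the proportion of differing sites (p-distance), which is one possible input to a neighbor-joining analysis, and the sequences, names, and class labels are hypothetical toy values rather than tb1 data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def _mean(values):
    return sum(values) / len(values) if values else float("nan")


def mean_within_between(seqs, classes):
    """Mean p-distance for same-class pairs and for different-class pairs."""
    within, between = [], []
    for s, t in combinations(sorted(seqs), 2):
        d = p_distance(seqs[s], seqs[t])
        (within if classes[s] == classes[t] else between).append(d)
    return _mean(within), _mean(between)


# HYPOTHETICAL toy alignment and class labels (not tb1 sequences).
toy_seqs = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "ACGAACGT",
    "s4": "ACGAACGA",
}
toy_classes = {"s1": "strong", "s4": "strong", "s2": "moderate", "s3": "moderate"}

within, between = mean_within_between(toy_seqs, toy_classes)
print(f"mean within-class distance:  {within:.3f}")
print(f"mean between-class distance: {between:.3f}")
```

If class members clustered on the tree, the within-class mean would be expected to fall below the between-class mean. In this toy example it does not, which is the pattern described above for the tb1 trees.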
Discussion
Natural allelic series for simple phenotypic traits such as pigmentation are well documented in
the literature. For example, five alleles have been described at the R locus in maize, which
controls plant and kernel pigmentation. Each of these five alleles produces a distinct phenotype
based on pigment quantity, spatial patterning in kernels, the timing of pigmentation onset during
development, and which organs are pigmented (kernels, anthers, leaves and/or roots) (Styles et al.
1973). A similar allelic series for pigmentation has been described for the B locus of maize
(Styles et al. 1973; Radicella et al. 1992). Much like these examples from maize, an allelic series
for coat color in mice has been described (Phillips 1966; Jackson 1994). Alleles of the agouti
locus produce distinct coat colors and pattern differences due to factors in both the promoter and
coding region of the gene. Other than pigmentation, allelic series have been described for traits
such as self-incompatibility in plants (Nasrallah et al. 1991; Takayama and Isogai 2005).
Evidence for natural allelic series for complex or morphological traits has come from
association mapping and QTL studies (Purugganan and Suddith 1998; Todesco et al. 2010;
McKechnie et al. 2010). For example, an allelic series for flowering time was reported among a
diverse set of maize lines which display significant variation in flowering time (Buckler et al.
2009). In this example, statistical evidence for an allelic series is shown; however, there is no
formal proof that a single locus with multiple alleles explains the observed phenotypic series
since the occurrence of several tightly linked genes each with two alleles cannot be excluded.
Another concern with evidence for allelic series from QTL and association studies is that the
alleles are each typically characterized in a different genetic background. Thus, it is possible that
the QTL in question has only two alleles which form a large number of apparent allelic classes
based on the background in which they were assayed.
In this paper, we present evidence for a natural allelic series at tb1 for three complex
morphological traits: lateral branch internode length, the number of cupules per rank, and the
number of staminate spikelets (Figures 2-3). Our evidence for an allelic series at tb1 largely
eliminates concerns about the influence of genetic background by using isogenic lines. We also
examined the role of linked genes on trait variation associated with tb1 by considering the length
of the introgressed chromosomal segment surrounding tb1 for each of the teosinte introgressions.
We saw no evidence that phenotype is correlated with the length of the introgression segment
(Figures 1, 4), arguing against a role for linked genes contributing to the observed phenotypic
variation. However, ideally, teosinte introgression segments of a uniform length which only
contain the tb1 gene itself should be compared.
The feature that is best correlated with the phenotypic effects of the tb1 alleles that we
examined is the taxonomic origin of these alleles. In a principal components analysis based on
phenotype, the eight teosinte introgressions form three classes that correspond to Z. mays ssp.
parviglumis, Z. mays ssp. mexicana and Z. diploperennis (Figure 4). This result not only supports
the existence of an allelic series at tb1, but it also implicates tb1 in the morphological
diversification of these taxa in addition to its role in maize domestication. There are several
enticing correspondences between known morphological differences between these taxa and the
effects associated with the alleles of tb1 we assayed. First, Z. mays ssp. mexicana has more
fruitcases (a greater CUPR value) per ear than either Z. mays ssp. parviglumis or Z.
diploperennis (Iltis and Doebley 1980), and our Z. mays ssp. mexicana alleles have greater
CUPR values than our Z. mays ssp. parviglumis and Z. diploperennis alleles (Figure 3). Second,
Z. diploperennis has shorter lateral branches that are tipped in a mixed male-female inflorescence
unlike other teosintes that have longer lateral branches tipped by tassels (Iltis et al. 1979;
Doebley and Iltis 1980). The one Z. diploperennis allele we assayed has the smallest value for
LBIL (shorter branches) of all nine teosinte alleles assayed (Figure 3).
Given the correlation between taxonomy and allelic effects (Figure 4), we examined the
nucleotide sequences of the control region and coding sequence of tb1 for fixed differences
between taxa that may not have been visible in the phylogenetic trees. No fixed differences were
found between Z. mays ssp. mexicana and Z. mays ssp. parviglumis individuals for either
sequenced region. The Z. diploperennis sequence is highly divergent from the other alleles with
many sequence differences. With such a large number of differences and only a single Z.
diploperennis sample, it is not possible to say which, if any, are potentially causative. However,
there are two polymorphisms unique to the Z. diploperennis allele of tb1 that cause radical amino
acid changes in the Helix II portion of the TCP domain, which is involved in DNA binding. Both
changes are from hydrophobic to hydrophilic amino acids, which could alter protein function.
Further experimentation is needed to test whether these amino acid differences affect phenotype.
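The search for fixed differences described above amounts to scanning an alignment for columns at which each taxon is monomorphic but the taxa differ from one another. The following is a minimal sketch under that definition. The function name and the toy sequences are hypothetical and are not the tb1 alignment.

```python
def fixed_differences(group_a, group_b):
    """Return 0-based alignment columns that are fixed differences.

    A column qualifies when every sequence in group_a shares one state,
    every sequence in group_b shares one state, and the two states differ.
    """
    if not group_a or not group_b:
        raise ValueError("both groups need at least one sequence")
    seqs = list(group_a) + list(group_b)
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for i in range(length):
        states_a = {s[i] for s in group_a}
        states_b = {s[i] for s in group_b}
        if len(states_a) == 1 and len(states_b) == 1 and states_a != states_b:
            sites.append(i)
    return sites


# HYPOTHETICAL toy alignment: column 2 is polymorphic within group A,
# column 4 is fixed for different bases in the two groups.
toy_a = ["ACGTA", "ACGTA", "ACCTA"]
toy_b = ["ACGTT", "ACGTT"]
print(fixed_differences(toy_a, toy_b))

# With a single sequence in one group, every difference from a monomorphic
# comparison group is reported as "fixed".
print(fixed_differences(["AAT", "AAT"], ["GAC"]))
```

The second call illustrates the caveat noted for Z. diploperennis: with only one sampled sequence, fixed differences cannot be distinguished from polymorphisms private to that individual.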
In summary, our experiments provide evidence for a natural allelic series at tb1 with effects on
complex morphological traits. It has been previously shown that tb1 played a major role in the
domestication of maize from its wild progenitor, teosinte (Doebley 2004). Since the allelic
classes that we observed at tb1 correspond with taxonomic origin, tb1 may also have played a
role in the morphological diversification of Z. mays ssp. parviglumis, Z. mays ssp. mexicana and
Z. diploperennis. To provide formal proof of the allelic series at tb1 and verify its role in the
divergence of teosinte, the causal polymorphism underlying the phenotypic differences needs to
be identified.
References
Buckler, E. S., J. B. Holland, P. J. Bradbury, C. Acharya, P. J. Brown et al., 2009 The genetic
architecture of maize flowering time. Science 325: 714-718.
Clark, R. M., T. Nussbaum Wagler, P. Quijada and J. Doebley, 2006 A distant upstream
enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and
inflorescent architecture. Nat. Genet. 38: 594-597.
Cubas, P., N. Lauter, J. Doebley and E. Coen, 1999 The TCP domain: a motif found in proteins
regulating plant growth and development. The Plant Journal 18: 215-222.
Doebley, J. F., and H. H. Iltis, 1980 Taxonomy of Zea (Gramineae). I. Subgeneric classification
with key to taxa. Amer. J. Bot. 67: 982-993.
Doebley, J., A. Stec and L. Hubbard, 1997 The evolution of apical dominance in maize. Nature
386: 485-488.
Doebley, J., 2004 The genetics of maize evolution. Annu. Rev. Genet. 38: 37-59.
Fontanesi, L., M. Tazzoli, F. Beretti and V. Russo, 2006 Mutations in the melanocortin 1
receptor (MC1R) gene are associated with coat colours in the domestic rabbit
(Oryctolagus cuniculus). Anim. Genet. 37: 489-493.
Iltis, H. H., and J. F. Doebley, 1980 Taxonomy of Zea (Gramineae). II. Subspecific categories in
the Zea mays complex with a generic synopsis. Amer. J. Bot. 67: 994-1004.
Jackson, I. J., 1994 Molecular and developmental genetics of mouse coat color. Annu. Rev.
Genet. 28: 189-217.
Mackay, T. F. C., 2001 The genetic architecture of quantitative traits. Annu. Rev. Genet. 35:
303-339.
McKechnie, S. W., M. J. Blacket, S. V. Song, L. Rako, X. Carroll et al., 2010 A clinally varying
promoter polymorphism associated with adaptive variation in wing size in Drosophila.
Mol. Ecol. 19: 775-784.
Nasrallah, J. B., T. Nishio and M. E. Nasrallah, 1991 The self-incompatibility genes of
Brassica: Expression and use in genetic ablation of floral tissues. Annu. Rev. Plant
Physiol. Plant Mol. Biol. 42: 393-422.
Phillips, R. J. S., 1966 A cis-trans position effect at the A locus of the house mouse. Genetics 54:
485-495.
Purugganan, M. D., and J. I. Suddith, 1998 Molecular population genetics of the Arabidopsis
CAULIFLOWER regulatory gene: Nonneutral evolution and naturally occurring variation
in floral homeotic function. Proc. Natl. Acad. Sci. USA 95: 8130-8134.
Radicella, J. P., D. Brown, L. A. Tolar and V. L. Chandler, 1992 Allelic diversity of the maize
B regulatory gene: different leader and promoter sequences of two B alleles determine
distinct tissue specificities of anthocyanin production. Genes Dev. 6: 2152-2164.
Styles, E. D., O. Ceska and K.-T. Seah, 1973 Developmental differences in action of R and B
alleles in maize. Can. J. Genet. Cytol. 15: 59-72.
Swofford, D. L., 2003 PAUP*: Phylogenetic analysis using parsimony (* and other methods),
Version 4.0b10. Sinauer Associates, Sunderland, MA.
Takayama, S., and A. Isogai, 2005 Self-incompatibility in plants. Annu. Rev. Plant Biol. 56:
467-489.
Tanksley, S. D., 1993 Mapping polygenes. Annu. Rev. Genet. 27: 205-233.
Thompson, J. D., D. G. Higgins and T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity
of progressive multiple sequence alignment through sequence weighting, position-specific
gap penalties and weight matrix choice. Nucl. Acids Res. 22: 4673-4680.
Todesco, M., S. Balasubramanian, T. T. Hu, M. B. Traw, M. Horton et al., 2010 Natural allelic
variation underlying a major fitness trade-off in Arabidopsis thaliana. Nature 465: 632-636.
Table S1: Introgressed teosinte germplasm
Line | Species | Subspecies | Country | State/Province | Population | Collector | Collection | Lat Deg | Lat Min | Long Deg | Long Min
IS1 | mays | parviglumis | Mexico | Guerrero | 1 mile S of Palo Blanco | Beadle & Kato | Site 4 | 17 | 25 | -99 | 30
IS2 | mays | parviglumis | Mexico | Guerrero | 30 km S of Chilpancingo | Beadle & Kato | Site 2-3 | 17 | 12 | -99 | 30
IS3 | mays | mexicana | Mexico | Mexico | km 43 on hwy from Chalco to Amecameca | Iltis et al. | 28622 | 19 | 6 | -98 | 42
IS4 | mays | mexicana | Mexico | Jalisco | 10 km S of Degollado | M. Puga | 11066 | 20 | 22 | -102 | 11
IS5 | diploperennis | | Mexico | Jalisco | Zarza Mora, 2 km E of Las Joyas | Iltis et al. | 1250 | 19 | 35 | -104 | 16
IS6 | mays | parviglumis | Mexico | Guerrero | 1 km N of Mazatlan | Beadle & Kato | Site 1 | 17 | 30 | -99 | 30
IS7 | mays | mexicana | Mexico | Chihuahua | Nobogame | Beadle | s.n. | 26 | 6 | -107 | 0
IS8 | mays | parviglumis | Mexico | Guerrero | Sites 9-10, Teloloapan-Arcelia Hwy | Iltis & Cochrane | 81 | 18 | 21 | -100 | 12
IS9 | mays | mexicana | Mexico | Mexico | km 1.8 WSW of Texcoco | H. Iltis | 28620 | 19 | 30 | -98 | 55
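The collection coordinates in Table S1 are given as separate degree and minute columns. To plot them, they can be converted to decimal degrees. The sketch below is an illustration rather than part of the original analysis. It assumes that a negative degree value marks west longitude and that the minutes column is an unsigned magnitude. The function name is ours.

```python
import math


def to_decimal_degrees(degrees, minutes):
    """Convert degrees and minutes to decimal degrees.

    Assumption: the sign of `degrees` applies to the whole coordinate and
    `minutes` is a non-negative magnitude. A coordinate between 0 and -1
    degrees must be passed as the float -0.0, because the integer 0 cannot
    carry a sign.
    """
    if not 0 <= minutes < 60:
        raise ValueError("minutes must be in the range [0, 60)")
    sign = -1.0 if math.copysign(1.0, degrees) < 0 else 1.0
    return sign * (abs(degrees) + minutes / 60.0)


# Coordinates for IS1 as listed in Table S1: 17 deg 25 min, -99 deg 30 min.
lat = to_decimal_degrees(17, 25)
lon = to_decimal_degrees(-99, 30)
print(f"IS1: latitude {lat:.4f}, longitude {lon:.4f}")
```

Applying the sign to the sum, rather than adding the minutes to a negative degree value, avoids shifting western longitudes toward zero.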
Table S2: Markers for genotyping
Markers | Forward Primer (5’ to 3’) | Reverse Primer (5’ to 3’)
umc2569 a | GTGACACCCTAGCCCTCTTAGACA | TAGCTGGAGTATGTCGTCTTGGTG
umc2237 a | CTCAGCTACAGGAGCGAAGAGG | GTCACTGCACGATCCATCACAT
umc1122 a | CACAACTCCATCAGAGGACAGAGA | CTGCTACGACATACGCAAGGC
umc2396 a | TGCATCTTTAGCTACGAGACAACCT | TGCATGCATTTTTAGGTTTGGAAT
bngl615 a | CTTCCCTCTCCCCATCTCCTTTCCAA | GCAACCTGTCCATTCTCACCAGAGGATT
bnlg1025 a | TGGTGAAGGGGAAGATGAAG | CCGAGACGTGACTCCTAAGC
bnlg1564 a | ACGGGAGAACAAAAGGAAGG | CTCTCCCTCACATCCGCC
bnlg1629 a | GTTGGATGGAAAATTCTAGATCG | TTGCGTCATTACAGCAGGAG
bnlg2228 a | GCAGCAATCGACACGAGATA | CTTGGATCGCACTCCGTC
umc2181 a | ATCGGGTCCGGATAGATTTTACAC | GTAGCTAGCTTAAGCAGTGCTCCG
mmc0041 a | AGGACTTAGAGAGGAAACGAA | TTTATCCTTACTTGCAGTTGC
umc1924 a | GGATGCGGTCGTACAGTACAAGTAT | CTACAACAACTGCTGCTCCCG
umc1991 a | GAAATTGATGCAATTCACCCTGAT | ATTGAATTGCGTGATGCAAGAGTA
umc1914 a | CAACATGAGCGTGCTAAATACTCG | ACAGGAACACATGAGGTCATCAAA
umc2047 a | GACAGACATTCCTCGCTACCTGAT | CTGCTAGCTACCAAACATTCCGAT
umc1298 a | AGCTGAACAAAATAAACGGAACGA | AGGACAAGAAAAAGAAGAAGCACG
PZD00117.indel1 a | CCCGCGGCCCGCCGTCAAGT | ATGCGCGGGCAAGCGCACCG
umc1306 a | CGAAACAAAACACCCAGCAGTAGT | CCAGGATGAATAAATCGTATTGCC
bnlg1502 a | AGGTCCTGGCACTAAGAGCA | AGAGGTGGTATGATCACCTGG
umc1082 a | CCGACCATGCATAAGGTCTAGG | GCCTGCATAGAGAGGTGGTATGAT
PZD00101.indel1 a | ATCGACCAACCAACTTCTCG | GCTTGGCAGTGGGTTAGTGT
umc1726 a | GATGAGGAAGAAAAGGGAAAAGGA | AGACTCAACCCTAACCCTAATGGG
bnlg1671 a | TCACGATCAGCAAGCAATTC | CCCCACCAACCTTAGAGTCA
umc1774 a | ATGGGACTATGCATGGTATTTTGG | TACACCATACGTCACCAGGTTCAC
umc2223 a | ACTTCTGCAGAGCGAGCAGG | TTTTGGGACTGAAGAAGAAGATCG
umc1500 a | TCTCTGACTATTCCACGAGCTCAA | CTGGTGCGTGCTACAACTGTG
umc1421 a | TGCTACGAACTGGGATACACTCAA | AGTGGTGAATGTGCCCTAGGAATA
GS1 b | ACACCGCCACCGACATCT | TTGTCCCTGAACGGCCAATA
CR Indel c | CGGTCAAAGAGTAGGGCAAG | GCGTCTGTTCCGCATTCA
a Markers used to map the introgressed teosinte segments.
b Directly labeled FAM genescan marker used to genotype IS3 F2 population.
c Agarose gel marker used to genotype all IS F2 populations except IS3.