www.advances.sciencemag.org/cgi/content/full/1/6/e1400218/DC1
Supplementary Materials for
Genome-environment associations in sorghum landraces predict
adaptive traits
Jesse R. Lasky, Hari D. Upadhyaya, Punna Ramu, Santosh Deshpande, C. Tom Hash, Jason Bonnette,
Thomas E. Juenger, Katie Hyma, Charlotte Acharya, Sharon E. Mitchell, Edward S. Buckler, Zachary
Brenton, Stephen Kresovich, Geoffrey P. Morris
Published 3 July 2015, Sci. Adv. 1, e1400218 (2015)
DOI: 10.1126/sciadv.1400218
The PDF file includes:
Fig. S1. Landrace accessions included in the study classified into botanical races
based on morphological classification (five new world accessions not shown).
Fig. S2. Rainout shelter where plants were grown in Austin.
Fig. S3. Seedlings planted in the Austin experiment.
Fig. S4. Plants growing in Austin.
Fig. S5. A representative accession (IS 25836) under irrigated (left) and imposed
terminal drought (right) conditions at experimental plot in India.
Fig. S6. Proportion of total SNP variation among accessions with known
collection locations (excluding spatial outlier landraces from the Americas, China,
and Southeast Asia/Oceania) explained by spatial structure or environmental
variables.
Fig. S7. Predictions of phenotypes averaged across well-watered and drought
conditions from drought treatment across growing season in Austin, United
States.
Fig. S8. Predictions of phenotype change between well-watered and drought
conditions from drought treatment across growing season in Austin, United
States.
Fig. S9. Predictions of phenotypes averaged across well-watered and drought
conditions from drought treatment late in growing season in Hyderabad, India.
Fig. S10. Predictions of phenotype change between well-watered and drought
conditions from drought treatment late in growing season in Hyderabad, India.
Fig. S11. GWAS for harvest index plasticity (harvest index in wet/dry) in U.S.
experiment, using SNP associations with precipitation in the warmest quarter as a
prior.
Fig. S12. GWAS for panicle weight plasticity (panicle weight in wet − dry) in
India experiment, using SNP associations with growing season length as a prior.
Fig. S13. GWAS for root growth plasticity (growth in control/Al toxic) in
published aluminum toxicity experiment (44), using SNP associations with topsoil
pH as a prior.
References (78–81)
Other Supplementary Material for this manuscript includes the following:
(available at www.advances.sciencemag.org/cgi/content/full/1/6/e1400218/DC1)
Table S1 (Microsoft Excel format). Landraces studied and environment of origin
data.
Table S2 (Microsoft Excel format). Accession phenotypes from experiment in
Austin, United States.
Table S3 (Microsoft Excel format). Mean phenotypes across the 2 years of the
experiment in Hyderabad, India.
Table S4 (Microsoft Excel format). Spearman’s rank correlation test results for
two SNPs that tag known candidate genes potentially involved in local adaptation.
Table S5 (Microsoft Excel format). EMMA t test results for two SNPs that tag
known candidate genes potentially involved in local adaptation.
Table S6 (Microsoft Excel format). Predictions for each accession in the Austin
experiment based on SNP-environment associations in the landrace panel.
Table S7 (Microsoft Excel format). Predictions for each accession in the
Hyderabad experiment based on SNP-environment associations in the landrace
panel.
Table S8 (Microsoft Excel format). Predictions for each accession in the Caniato
et al. (44) experiment based on SNP-environment associations in the landrace
panel.
Table S9 (Microsoft Excel format). Predicted environments for accession in the
three experiments based on kinship associations with environment of landraces
(gBLUP).
Table S10 (Microsoft Excel format). Pearson’s correlations between predictions
based on environment-genome associations and phenotypes in Austin.
Table S11 (Microsoft Excel format). Pearson’s correlations between predictions
based on environment-genome associations and phenotypes in Hyderabad.
Table S12 (Microsoft Excel format). Pearson’s correlations between predictions
based on environment-genome associations and phenotypes in the Caniato et al.
(44) Al toxicity experiment.
Table S13 (Microsoft Excel format). Pearson’s correlations between phenotypes
and environment of origin (where known) for landraces in the Austin experiment.
Table S14 (Microsoft Excel format). Pearson’s correlations between phenotypes
and environment of origin (where known) for landraces in the Hyderabad
experiment.
Table S15 (Microsoft Excel format). Pearson’s correlations between relative net
root growth and environment of origin (where known) for landraces in the
Caniato et al. (44) experiment.
Table S16 (Microsoft Excel format). Pearson’s correlation coefficients between
predicted phenotypes in the Austin experiment and observed, where predictions
based on genome associations with phenotypes in fivefold cross-validation.
Table S17 (Microsoft Excel format). Pearson’s correlation coefficients between
predicted phenotypes in the Hyderabad experiment and observed, where
predictions based on genome associations with phenotypes in fivefold cross-
validation.
Table S18 (Microsoft Excel format). Pearson’s correlation coefficients between
predicted phenotypes in the Caniato et al. (44) experiment and observed, where
predictions based on genome associations with phenotypes in fivefold cross-
validation.
Table S19 (Microsoft Excel format). Top 1000 SNPs associated with harvest
index plasticity in Austin using SNP associations with precipitation of the
warmest quarter as priors.
Table S20 (Microsoft Excel format). Top 1000 SNPs associated with relative net
root growth (comparing control treatment with Al toxic treatment) in Caniato et
al. (44) experiment, using SNP associations with topsoil pH as priors.
Table S21 (Microsoft Excel format). Top 1000 SNPs associated with panicle
weight plasticity in Hyderabad, using SNP associations with growing season
length as priors.
Supplementary Materials
1. Data
1.1 Genotype data
In order to maximize geographic coverage of landraces, we started with the 469 landraces
in (23) and sequentially added landraces from the NPGS-GRIN (accessions with ‘PI' in
the prefix of their name) or ICRISAT gene banks (accessions with ‘IS' in the prefix of
their name). At each step, we selected the landrace with the greatest distance to the
nearest neighboring landrace that was already in the panel.
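The sequential selection described above can be sketched as a greedy "farthest from nearest neighbor" procedure. This is only an illustration: the great-circle (haversine) distance, the function names, and the coordinates are assumptions, since the text does not specify the distance measure or implementation.

```python
import math

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def add_most_isolated(panel, candidates, n_add):
    """Greedily add the candidate farthest from its nearest panel member."""
    panel = dict(panel)        # name -> (lat, lon), already in the panel
    pool = dict(candidates)    # name -> (lat, lon), not yet selected
    added = []
    for _ in range(min(n_add, len(pool))):
        best = max(pool, key=lambda name: min(
            great_circle_km(pool[name], p) for p in panel.values()))
        panel[best] = pool.pop(best)   # the panel grows at each step
        added.append(best)
    return added

# Hypothetical coordinates, for illustration only.
panel = {"A": (10.0, 10.0)}
candidates = {"B": (10.5, 10.0), "C": (20.0, 30.0), "D": (18.0, 30.0)}
order = add_most_isolated(panel, candidates, 2)
```

Because the panel is updated after every addition, a candidate close to a newly added landrace becomes less likely to be chosen next, which spreads the panel geographically.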
After imputation, an average of 15.6% of SNP calls remained missing (median =
10.7%, SD = 15.6% of accessions missing for each SNP; median = 12.5%, SD = 10.8%
of SNPs missing for each accession).
Figure S1. Landrace accessions included in the study classified into botanical races based on
morphological classification (five new world accessions not shown). Classification was obtained
from germplasm passport data obtained from Genesys or GRIN.
1.2 Environmental data
1.2.1 Climate data
Vapor pressure deficit (VPD) is the difference between water vapor partial pressure and maximum
potential pressure at a given air temperature, reflecting the evaporative demand
experienced by plants. We extracted mean monthly relative humidity and temperature
from CRU data (65) and calculated VPD at mean conditions (78).
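As a rough sketch of how VPD can be computed from mean relative humidity and temperature, the example below uses the Tetens approximation for saturation vapor pressure. The choice of approximation and the function names are assumptions; the exact formulation used in (78) is not reproduced here.

```python
import math

def saturation_vapor_pressure_kpa(temp_c):
    """Tetens approximation of saturation vapor pressure (kPa) over water."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def vpd_kpa(temp_c, rel_humidity_pct):
    """VPD (kPa): saturation pressure minus actual vapor partial pressure."""
    e_sat = saturation_vapor_pressure_kpa(temp_c)
    e_act = e_sat * rel_humidity_pct / 100.0
    return e_sat - e_act

# Hypothetical monthly means: 25 degrees C and 50% relative humidity.
example_vpd = vpd_kpa(25.0, 50.0)
```

At 100% relative humidity the deficit is zero, and it grows as air becomes warmer or drier.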
Reanalysis data (66), which we used to calculate inter-annual precipitation
variability, were generated on a T62 grid (resolution ~ 210 km) for the years 1948-2009
(data provided by NOAA/OAR/ESRL PSD, http://www.esrl.noaa.gov/psd/).
1.2.2 Edaphic data
Estimated water capacity in (67) was based on soil texture, organic matter content, and
plant root depth or profile depth. Depth to water table was modeled by (68) in 30 arc-second
grid squares using a numerical model. We used the Harmonized World Soil
Database (v. 1.21) to extract topsoil pH for the most common soil type in each 30 arc-second
grid cell (70).
1.2.3 Predicted growing seasons
The FAO criteria for growing season determination for dryland crops
(http://www.fao.org/nr/climpag/cropfor/lgp_en.asp;
http://www.fao.org/geonetwork/srv/en/metadata.show?id=73) require monthly
precipitation to equal at least 0.4 * monthly potential evapotranspiration (PET) for growing season months.
Additionally, monthly precipitation in excess of monthly PET can accumulate as stored
soil moisture up to the maximum soil moisture capacity (smc) for the site (see data
above). Stored soil moisture (ssm) carried over to the following month was calculated as
ssm_{t+1} = \max\{0, \min\{smc, ssm_t + rainfall_t - PET_t\}\}
Thus the growing season can be extended to months where rainfall + stored soil moisture
is at least 0.4 * monthly PET, i.e. when
ssm_t + rainfall_t \geq 0.4 \, PET_t.
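The stored soil moisture recursion and the growing season criterion can be sketched as a simple monthly loop. The starting value of stored moisture (zero), the single pass through the months, and the example numbers are assumptions made for illustration.

```python
def growing_season_months(rainfall, pet, smc, threshold=0.4, ssm0=0.0):
    """Flag months where stored soil moisture + rainfall >= threshold * PET."""
    ssm = ssm0   # stored soil moisture entering the first month (assumed 0)
    flags = []
    for rain_t, pet_t in zip(rainfall, pet):
        flags.append(ssm + rain_t >= threshold * pet_t)
        # Carry surplus to the next month, bounded below by 0 and above by smc.
        ssm = max(0.0, min(smc, ssm + rain_t - pet_t))
    return flags

# Hypothetical monthly values (mm): one wet month followed by drying.
rainfall = [200.0, 50.0, 20.0, 0.0]
pet = [100.0, 100.0, 100.0, 100.0]
season = growing_season_months(rainfall, pet, smc=80.0)
no_storage = growing_season_months(rainfall, pet, smc=0.0)
```

In this example the third month qualifies only because of moisture carried over from earlier months; with no storage capacity it would fall outside the growing season.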
1.3 Phenotype data
1.3.1 Drought experiment in USA
The study was conducted within the 82-acre Brackenridge Field
Laboratory of the University of Texas at Austin (Austin, TX), adjacent
to the Colorado River. Mean maximum temperature (July-August) is ~35.0 °C, and the
mean minimum temperature (December) is ~3.0 °C. Soils are Yazoo sandy loam and are
greater than 1.25 m deep.
The rainout shelter was covered by a clear 240 μm polyethylene roof, which
transmits 90% of photosynthetically active radiation (PAR). The shelter consists of two
laterally connected arched structures with 2.1 m high open sidewalls and 4.2 m high open
ends. The height of the shelter arches maximizes airflow and heat dissipation to maintain
conditions underneath the shelter near ambient.
The planting site contains 34 planting beds each measuring 2.15 x 15.25 m and
oriented perpendicular to the long edge of the shelter. Each planting bed was isolated on
the long edge from neighboring beds by an 8 mm thick corrugated HDPE plastic barrier
that extends from the soil surface down to a depth of 1.25 m. Irrigation was supplied to
each planting bed via three rows of drip irrigation tape with a 10.15 cm emitter spacing
(Chapin Twin Wall BTF Drip Tape, Jain Irrigation Systems Ltd., Jalgaon, India). The
irrigation system is closed and pressure regulated at 10 psi with 2 independently
controllable zones of 17 beds each.
Figure S2. Rainout shelter where plants were grown in Austin.
Genetic Material and Experimental Design: We randomly placed 4 plants of each
accession into 4 of the 17 beds for each treatment, such that no accession was represented
by more than one plant per bed. We then randomly assigned plants to positions within beds, with the
conditions that A) no accession had more than one plant located along the short ends of
beds and B) accessions had a total of 1-2 plants per treatment located in the center
columns of their respective beds (recall that accessions never have more than one plant in
a bed).
Planting and Irrigation Treatments: Plants were sown over two days (April 18-19, 2013)
in a temperature-regulated greenhouse at 24 °C into 3.8 cm x 3.8 cm x 6 cm pots. Pots
were filled with a rice hull based manufactured soil (Ranch Rose, GeoGrowers, Austin,
TX). Two seeds per pot were sown and emergence was generally complete by April 27.
Pots were thinned to one seedling during May 1-9. Seedlings were watered as needed.
Planting beds at the field site were prepared with a rotary tine tiller to a depth of
20 cm. Woven ground cover (Sunbelt Brand, The DeWitt Company, Sikeston, MO) was
applied to each planting bed for weed control. The ground cover measured 1.85 m wide
and extended the entire length of each planting bed. The outermost 0.15 m of each
planting bed adjacent to the plastic irrigation barrier was left uncovered to prevent any
surface runoff between neighboring beds.
Prior to planting, 25 x 25 cm holes were cut in the groundcover for each planting
location. Each bed contained 3 rows of 32 plants at a 45 cm plant-to-plant distance, with
the center row centered in the bed, resulting in ~29k plants ha⁻¹. Irrigation by broadcast
sprinklers was applied on May 14, 20, and 27 to establish the transplants.
All subsequent irrigation was conducted with drip irrigation. Each row of plants within
each bed was irrigated by a line of drip tape placed adjacent to the base of the plant.
Irrigation was applied at a rate of 5.11 liters per minute per bed and began on June 9 2013
and ended on October 25 2013.
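The stated planting density can be checked from the bed dimensions (2.15 x 15.25 m) and the number of plants per bed (3 rows of 32). The helper name below is hypothetical, and the calculation assumes density is expressed per unit of bed area only.

```python
def plants_per_hectare(n_plants, bed_width_m, bed_length_m):
    """Planting density in plants per hectare (1 ha = 10,000 m^2)."""
    area_m2 = bed_width_m * bed_length_m
    return n_plants / area_m2 * 10_000

# 3 rows x 32 plants in a 2.15 m x 15.25 m bed.
austin_density = plants_per_hectare(3 * 32, 2.15, 15.25)
```

This gives a value close to the ~29k plants per hectare reported for the Austin experiment.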
To further characterize our treatment, we sampled soil moisture of the top 20 cm
by gravimetric methods using a 2 cm diameter soil core. Cores were randomly taken from
both well-watered and drought rows over four July sampling dates (n = 120). Cores were
pooled in sets of three, dried, and gravimetric water content was calculated as (mass of
water/bulk mass). We analyzed soil water content by fitting a linear mixed model using
SAS Proc Mixed with irrigation treatment, sampling date, and their interaction as fixed
effects and row as a random effect. This model detected a highly significant treatment ×
sampling date interaction (p-value < 0.001), indicating an increasing degree of soil
moisture deficit over the growing season. On average, the irrigation manipulation
resulted in a ~50% reduction in gravimetric soil moisture content of the top 20 cm in the
drought relative to the well-watered beds (mean ± SE; well-watered = 0.10 ±,
drought = 0.045 ± 0.003).
Figure S3. Seedlings planted in the Austin experiment.
Figure S4. Plants growing in Austin.
Phenotype measurements: Measurements for the approximation of leaf chlorophyll
content were taken with a SPAD 502 meter (Konica Minolta, Tokyo, Japan).
Measurements were taken during mid-day, July 23-25, on the first fully emerged leaf that
was not the flag leaf, on the adaxial side of the leaf midway between the midrib and leaf
margin, halfway down the length of the leaf. Three readings, taken 1 cm apart, were
averaged for the final value. Since the output of the SPAD 502 meter does not reflect a
linear relationship to chlorophyll content over the range of its detection, we transformed
the output values using the monocot consensus equation (CHL[est.] =
(82.2*SPAD)/(135-SPAD)) (79). Tiller number was assessed during the course of the
SPAD measurements.
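The SPAD transformation given above can be written as a small function. Averaging the three readings before applying the transformation is an assumption about the order of operations, and the readings shown are hypothetical.

```python
def chlorophyll_from_spad(spad):
    """Monocot consensus equation: CHL[est.] = (82.2 * SPAD) / (135 - SPAD)."""
    if not 0 <= spad < 135:
        raise ValueError("SPAD value outside the valid range of the equation")
    return 82.2 * spad / (135.0 - spad)

def leaf_chlorophyll(readings):
    """Average the SPAD readings for a leaf, then transform the mean."""
    mean_spad = sum(readings) / len(readings)
    return chlorophyll_from_spad(mean_spad)

# Hypothetical readings taken 1 cm apart on one leaf.
chl = leaf_chlorophyll([40.0, 45.0, 50.0])
```

Because the equation is nonlinear, equal steps in SPAD correspond to progressively larger steps in estimated chlorophyll at high SPAD values.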
Mesh seed bags (Midco Enterprises, Kirkwood, Missouri) were
applied successively to seed heads in the field during the hard dough stage to
reduce grain loss from shattering and pests. Harvest began on September 9 and
continued through November 30. For all accessions with seed set to the hard dough stage
by November 25, seed heads were harvested separately from whole plants. Accessions
that had no seed set to the hard dough stage, or had not yet flowered by November 25
were harvested whole. Plants were harvested at the soil surface and panicles were
harvested with enough stem to allow threshing. Aboveground biomass (AGB) was determined by harvesting and
drying plant material in Kraft paper bags (whole plants and panicles separately) at 50 °C
for 10 days and then weighing.
We noted strong positional effects on phenotypes within plots, primarily with
respect to plants found at the edge of rainout shelters. In order to correct for these effects,
we built a linear mixed-model where accession phenotypes and phenotypic responses to
drought were fixed effects, plant presence in a row or column at edges of each of the two
shelters were fixed effects (four fixed effects for each long edge of the two shelters, one
fixed effect for the short edge of the whole plot; data from the other short
edge of the plot were discarded), while block effects were random effects (blocks were three plants, i.e. one
bed width, by four plants). We used the R package ‘lme4’ (80) to fit mixed effects
models. We estimated a fixed effect parameter for each accession's phenotype in each
condition (well-watered or drought). We tested for positive autocorrelation in the
residuals of the model and found no significant autocorrelation (Moran's-I, one-sided test,
all p > 0.25).
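A one-sided test for positive spatial autocorrelation in model residuals can be sketched with Moran's I and a permutation null. The neighbor weights, the permutation approach, and the example residuals are assumptions for illustration; the text does not state how the weights or the null distribution were defined.

```python
import random

def morans_i(values, weights):
    """Moran's I given values and a dict {(i, j): w_ij} of spatial weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    den = sum(d * d for d in dev)
    return (n / sum(weights.values())) * num / den

def one_sided_p(values, weights, n_perm=999, seed=0):
    """Permutation p-value for I being larger than expected by chance."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical row of 6 plants; adjacent plants are neighbors (weight 1).
weights = {}
for i in range(5):
    weights[(i, i + 1)] = 1.0
    weights[(i + 1, i)] = 1.0
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
i_clustered, p_clustered = one_sided_p(clustered, weights)
i_alternating = morans_i(alternating, weights)
```

Spatially clustered residuals give a positive I, while alternating residuals give a negative I; a large one-sided p-value indicates no evidence of positive autocorrelation.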
1.3.2 Drought experiment in India
The planting was on a Vertisol (Kasireddipally series; isohyperthermic Typic Pellustert). The
minimum monthly average temperature during postrainy seasons at Patancheru ranged
from 10.9 to 22.8 °C and maximum monthly average temperature from 27.3 to 37.5 °C.
The cumulative rainfall was 23.9 mm in 2010-2011 and 41.0 mm in 2011-2012. The
average day length across the experiment was 11.64 hours each year (11.08 hours from
October-December and 12.75 hours from January-April). Each plot consisted of one 4 m
row, with inter- and intra-row spacing of 75 cm and 10 cm, respectively,
resulting in ~133,000 plants ha⁻¹. Ammonium phosphate was applied at the rate of 150 kg
ha⁻¹ as a basal dose, while urea at the rate of 100 kg ha⁻¹ was applied as top dressing three
weeks after planting. Days to 50% flowering was recorded on a plot basis, while
plant height (cm) was measured on five plants at maturity. Panicle
weight (g plant⁻¹) and grain yield (g plant⁻¹) were recorded on 5 representative plants.
2. SNP-environment associations
2.1 Partitioning Sorghum SNP variation
We found that on average 30.5% of SNP variation among African and Eurasian
accessions could be explained by environmental variables (SD = 1.7% across 10,000
resamples) while 28.4% could be explained by spatial variables (SD = 1.7%, Figure S6).
The portion of SNP variation explained by collinear environmental and spatial variables
was large, 23.4% (SD = 1.7%) leaving only 7.1% to be explained by environment
independent of spatial variables (SD = 0.6%) and 5.0% to be explained by spatial
variables independent of environment (SD = 0.5%). Compared to Arabidopsis thaliana
populations in their native Eurasian range, sorghum shows greater environmental (31%
vs. 16%) and spatial (28% vs. 17%) structure in SNP variation (37). The stronger
geographic structure in sorghum versus Arabidopsis may reflect the deliberate movement
of sorghum genotypes to similar climatic zones through human migration and trade
versus the undirected dispersal of Arabidopsis following glacial retreat (81).
Figure S5. A representative accession (IS 25836) under irrigated (left) and imposed terminal drought (right) conditions at experimental plot in India.
Figure S6. Proportion of total SNP variation among accessions with known collection locations (excluding spatial outlier landraces from the Americas, China, and Southeast Asia/Oceania) explained by spatial structure or environmental variables. The box represents all SNP variation among accessions, with white space inside the box showing residuals unexplained by either environmental or spatial variables. Due to rounding, annotated proportions do not sum to 1.
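The reported proportions in section 2.1 are internally consistent, which can be checked by subtracting the shared (collinear) portion from each total. The function name is hypothetical, and the unexplained remainder is derived here from the reported means rather than quoted from the text.

```python
def partition_variation(env_total, space_total, shared):
    """Split explained variation (%) into unique and shared components."""
    env_only = env_total - shared
    space_only = space_total - shared
    explained = env_only + space_only + shared
    return {
        "env_only": env_only,
        "space_only": space_only,
        "shared": shared,
        "explained": explained,
        "unexplained": 100.0 - explained,
    }

# Mean percentages reported in section 2.1.
parts = partition_variation(env_total=30.5, space_total=28.4, shared=23.4)
```

The unique components recovered this way match the 7.1% (environment only) and 5.0% (space only) given in the text, up to floating-point rounding.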
2.2 Enrichment of SNP variation explained by climate in non-synonymous SNPs
We found that environmental variables explained a higher proportion of variation in non-
synonymous SNPs than genome-wide controls (permutation test, z = 6.33, p < 0.002),
even after removing the effects of geographic spatial structure among accessions (z =
7.74, p < 0.002), suggesting that some environmentally structured variants contribute to
adaptation and are not merely neutral (Figure 3C). By contrast, the environmental
structure in intergenic SNPs was not different from genome-wide controls, before (z =
0.34, p = 0.744) and after removing spatial structure (z = -1.60, p = 0.122). Environment
explained less variation than expected in synonymous SNPs (z = -4.72, p < 0.002),
but after accounting for spatial structure, environment explained a higher than expected
proportion of synonymous SNP variation (z = 8.19, p < 0.002). The enrichment of
synonymous SNPs for environmental structure after accounting for space may be due to
linkage with nearby adaptive non-synonymous SNPs given that LD extends over many
kb (e.g. r² decays to half the maximum by 3 kb on average in our data set).
3. Predicting phenotypes based on SNP association models
3.1 Predictions based on environmental associations
The observations that accessions carrying aridity-associated alleles had higher yield
components (Figures S7 and S9), that landraces from arid environments had higher yield
components (Tables S13-S14), and that there was a positive relationship between
accession biomass and yield in wet versus drought treatments (Tables S2-S3) imply
that there is no trade-off between plant-level yield components and drought adaptation (at
least within the diverse germplasm we studied). The higher biomass and grain weight
associated with alleles from arid environments may reflect selection associated with
lower planting density in arid environments. In arid locations, plants are typically grown
at low density and face low competition from other plants, while in moist environments
plants are grown at high densities, likely leading to greater competition among plants. As
a result of negative competition effects, the highest yielding individual genotypes in
moist environments may not be the highest yielding when grown in monoculture, leading
farmers to select for reduced competitive effect in high density environments.
[Figure S7 panels, left column (predicted score vs. observed): flowering time avg., wet & dry (d) vs. temp. seasonality score, kinship (r = −0.28); chl content avg., wet & dry (mg cm⁻²) vs. aridity score, 5k SNPs (r = −0.22); biomass avg., wet & dry (g) vs. aridity score, 5k SNPs (r = 0.33); grain average, wet & dry (g) vs. grow. seas. aridity score, 500 SNPs + kin. (r = 0.30); harvest index (wet / dry) vs. grow. seas. aridity score, 10k SNPs + kin. (r = 0.22). Right column: r for 100, 250, 500, 1000, 5000, and 10000 SNPs and kinship; legend: env.-associated markers, markers & kinship combined, env.-kinship associations.]
Figure S7. Predictions of phenotypes averaged across well-watered and drought conditions from
drought treatment across growing season in Austin, United States. Predicted (x-axes in left
column of panels) versus observed average phenotypes across treatments (y-axes in same
panels) for breeding lines and landraces (circles in left column of panels), with predictions based
on SNP associations with environmental gradients. Comparison (right column of panels) of
predictions using different numbers of predictor SNPs, including a kinship matrix based on all
SNPs (right column, r = Pearson's correlation coefficient). Note that phenotype data were not
used in predictions. The best prediction is shown for each trait in panels in left column.
Predictions were standardized to z-scores.
[Figure S8 panels, left column (predicted score vs. observed): flowering time (wet / dry) vs. temp. seasonality score, 100 SNPs + kin. (r = 0.14); chl content (wet − dry, mg cm⁻²) vs. prec. seasonality score, 500 SNPs (r = 0.12); biomass (wet / dry) vs. aridity score, 10k SNPs (r = 0.10); grain (wet / dry) vs. prec. warmest q. score, 10k SNPs (r = 0.12). Right column: r for 100, 250, 500, 1000, 5000, and 10000 SNPs and kinship; legend: env.-associated markers, markers & kinship combined, env.-kinship associations.]
Figure S8. Predictions of phenotype change between well-watered and drought conditions from
drought treatment across growing season in Austin, United States. Predicted (x-axes in left
column of panels) versus observed change in phenotypes across treatments (y-axes in same
panels) for breeding lines and landraces (circles in left column of panels), with predictions based
on SNP associations with environmental gradients. Comparison (right column of panels) of
predictions using different numbers of predictor SNPs, including a kinship matrix based on all
SNPs (right column, r = Pearson's correlation coefficient). Note that phenotype data were not
used in predictions. The best prediction is shown for each trait in panels in left column.
Predictions were standardized to z-scores.
[Figure S9 panels, left column (predicted score vs. observed): height avg., wet & dry (cm) vs. grow. seas. length score, 250 SNPs + kin. (r = 0.19); flowering time avg., wet & dry (days) vs. temp. seasonality score, 250 SNPs + kin. (r = −0.41); SPAD avg., wet & dry vs. grow. seas. aridity score, 100 SNPs + kin. (r = 0.30); grain avg., wet & dry (g) vs. grow. seas. aridity score, 10k SNPs + kin. (r = 0.31); panicle avg., wet & dry (g) vs. grow. seas. aridity score, 10k SNPs + kin. (r = 0.28). Right column: r for 100, 250, 500, 1000, 5000, and 10000 SNPs and kinship; legend: env.-associated markers, markers & kinship combined, env.-kinship associations.]
Figure S9. Predictions of phenotypes averaged across well-watered and drought conditions from
drought treatment late in growing season in Hyderabad, India. Predicted (x-axes in left column of
panels) versus observed average phenotypes across treatments (y-axes in same panels) for
breeding lines and landraces (circles in left column of panels), with predictions based on SNP
associations with environmental gradients. Comparison (right column of panels) of predictions
using different numbers of predictor SNPs, including a kinship matrix based on all SNPs (right
column, r = Pearson's correlation coefficient). Note that phenotype data were not used in
predictions. The best prediction is shown for each trait in panels in left column. Predictions were
standardized to z-scores. Note that relationships between predicted and observed were even
stronger after removing the outliers for low predicted growing season aridity (bottom two left
panels).
[Figure S10 panels, left column (predicted score vs. observed): height (wet − dry, cm) vs. grow. seas. aridity score, 500 SNPs (r = −0.12); flowering time (wet − dry, days) vs. temp. seasonality score, kin. (r = 0.14); SPAD (wet − dry) vs. aridity score, 1000 SNPs + kin. (r = 0.16); grain (wet − dry, g) vs. prec. warmest q. score, 250 SNPs + kin. (r = 0.17). Right column: r for 100, 250, 500, 1000, 5000, and 10000 SNPs and kinship; legend: env.-associated markers, markers & kinship combined, env.-kinship associations.]
Figure S10. Predictions of phenotype change between well-watered and drought conditions from
drought treatment late in growing season in Hyderabad, India. Predicted (x-axes in left column of
panels) versus observed change in phenotypes across treatments (y-axes in same panels) for
breeding lines and landraces (circles in left column of panels), with predictions based on SNP
associations with environmental gradients. Comparison (right column of panels) of predictions
using different numbers of predictor SNPs, including a kinship matrix based on all SNPs (right
column, r = Pearson's correlation coefficient). Note that phenotype data were not used in
predictions. The best prediction is shown for each trait in panels in left column. Predictions were
standardized to z-scores.
3.2 Comparison with predictions based on phenotype associations
For comparison with predictions based on SNP-environment associations, we also
developed predictions based on SNP-phenotype associations, i.e. the traditional
implementation of marker-assisted selection (MAS) and genomic selection (GS). We tested this approach on biomass, grain weight,
panicle weight, harvest index, and relative net root growth (i.e. phenotypes discussed in
the main text). These models were implemented using the same methods as described
above (section 3.1) for predictions based on SNP-environment associations, with two
exceptions. First, association models (EMMA and gBLUP) were run with phenotypes as
responses. Second, because these phenotypic predictions were for the same accessions on
which the model was fit, we conducted 5-fold cross validation, leaving out one fifth of
accessions (chosen randomly) for each fold. Phenotypic predictions for each accession
were thus based on associations for different accessions.
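The cross-validation scheme described above can be sketched as follows. The `fit` and `predict` callables stand in for the association models (EMMA and gBLUP), which are not implemented here; the mean-of-training placeholder model, the function names, and the phenotype values are all hypothetical.

```python
import random

def make_folds(accessions, k=5, seed=0):
    """Randomly split accessions into k roughly equal folds."""
    rng = random.Random(seed)
    shuffled = list(accessions)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def cross_validated_predictions(phenotypes, fit, predict, k=5, seed=0):
    """Predict each accession from a model fit without its own fold."""
    folds = make_folds(sorted(phenotypes), k, seed)
    predictions = {}
    for held_out in folds:
        held = set(held_out)
        training = {a: y for a, y in phenotypes.items() if a not in held}
        model = fit(training)              # fit on the other four fifths
        for accession in held_out:
            predictions[accession] = predict(model, accession)
    return predictions

# Placeholder model (an assumption): predict the mean training phenotype.
def fit_mean(training):
    return sum(training.values()) / len(training)

def predict_mean(model, accession):
    return model

phenotypes = {f"acc{i}": float(i) for i in range(10)}   # hypothetical data
cv_predictions = cross_validated_predictions(phenotypes, fit_mean, predict_mean)
```

Each accession receives exactly one prediction, and that prediction never uses the accession's own phenotype.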
In the Austin drought experiment, phenotype associations with kinship (the best
model) were reasonable predictors of grain weight (r = 0.40) and harvest index averaged
across treatments (r = 0.36; Table S16). Biomass association with kinship (the best
model) was a very good predictor of biomass (r = 0.80). However, changes in grain and
harvest index were predicted very poorly by both kinship and SNPs (best grain model:
top 5,000 SNPs, r = -0.02; best harvest index model: top 5,000 SNPs, r = 0.07). Change
in biomass was best predicted by combined predictions of kinship and the top 1,000
SNPs (r = 0.20).
In the Hyderabad drought experiment, phenotype associations with kinship (the
best model) were good predictors of grain weight (r = 0.65) and panicle weight (r = 0.64)
averaged across treatments (Table S17). By contrast, the changes in grain and panicle
weight across treatments were not predicted as well (best grain model: top 500 SNPs +
kinship, r = 0.22; best panicle model: top 1,000 SNPs + kinship, r = 0.10).
The best predictor of root growth response to aluminum soil toxicity was the
kinship association with this trait (r = 0.60, Table S18).
3.3 Tests for enrichment of environment–associated SNPs in candidate loci under
selection
We tested for enrichment of regions identified by Mace et al. (46) with SNPs in
the 0.05 lower tail of p-values for all SNPs genome-wide. We focused on associations
with the three environment variables that predicted genotype by environment interactions
(Figure 4). Our observed test statistic was the 0.05 quantile for EMMA (mixed-model)
association p-values of SNPs found within the candidate regions of (46). We generated a
null expectation by circularly permuting SNP classifications as within or outside of their
candidate regions, and then calculating the 0.05 quantile for p-values of SNPs within the
permuted regions. We conducted 10,000 permutations of these classifications and used
the observed tail density * 2 as an empirical two-tailed p-value.
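The circular permutation test can be sketched as below. Several details are assumptions made for illustration: SNPs are taken to be ordered by genomic position and treated as one circular sequence, the quantile is a simple order statistic, the two-tailed p-value is twice the smaller tail proportion, and the p-values and region labels are synthetic.

```python
import random

def quantile(values, q):
    """Simple empirical quantile: element at floor(q * (n - 1)) when sorted."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]

def circular_enrichment_test(pvalues, in_region, n_perm=10_000, q=0.05, seed=0):
    """Compare the q-quantile of in-region p-values with rotated labels."""
    rng = random.Random(seed)
    n = len(pvalues)

    def stat(labels):
        return quantile([p for p, inside in zip(pvalues, labels) if inside], q)

    observed = stat(in_region)
    lower = upper = 0
    for _ in range(n_perm):
        shift = rng.randrange(1, n)                  # random circular offset
        rotated = in_region[shift:] + in_region[:shift]
        s = stat(rotated)
        lower += s <= observed
        upper += s >= observed
    tail = min(lower, upper) / n_perm
    return observed, min(1.0, 2 * tail)              # two-tailed p-value

# Synthetic data: 300 SNPs in genomic order, 60 of them in candidate regions.
pvalues = [((i * 37) % 101 + 1) / 102 for i in range(300)]
in_region = [100 <= i < 160 for i in range(300)]
observed, p_two_tailed = circular_enrichment_test(pvalues, in_region, n_perm=1000)
```

Rotating the labels rather than shuffling them keeps the size and spacing of the candidate regions intact, so the null distribution reflects the clustering of linked SNPs along the genome.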
We also tested whether these regions were enriched in multivariate environmental
structure. In these candidate regions we found significant enrichment for environmental
structure independent of spatial structure among landraces (p < 0.02). Surprisingly, these
candidate regions had significantly less spatially-structured climate variation than
expected (p < 0.02).
Figure S11. GWAS for harvest index plasticity (harvest index in wet/dry) in U.S. experiment,
using SNP associations with precipitation in the warmest quarter as a prior. The top panel shows
results from traditional mixed-model association (EMMA) (31) while the bottom panel shows the
approximate posterior probability of association (APPA), which uses an informative prior for each
SNP based on associations with environment in the landrace panel.
Figure S12. GWAS for panicle weight plasticity (panicle weight in wet – dry) in India experiment,
using SNP associations with growing season length as a prior. The top panel shows results from
traditional mixed-model association (EMMA) (31) while the bottom panel shows the approximate
posterior probability of association (APPA), which uses an informative prior for each SNP based
on associations with environment in the landrace panel.
Figure S13. GWAS for root growth plasticity (growth in control/Al toxic) in published aluminum
toxicity experiment (44), using SNP associations with topsoil pH as a prior. The top panel shows
results from traditional mixed-model association (EMMA) (31) while the bottom panel shows the
approximate posterior probability of association (APPA), which uses an informative prior for each
SNP based on associations with environment in the landrace panel.