Promises and challenges of eco-physiological genomics in ...May 31, 2016  · Supplementary Material for: ... decomposed granite and gypsum (Ranch Rose Mix, Geo Growers, Austin, TX)

Supplementary Material for:

Promises and challenges of eco-physiological genomics in the field: tests of drought responses in switchgrass. Lovell et al.

Table of Contents:

1. Supplementary Methods:
   a. Additional physiological methods
   b. Additional RNA extraction methods
   c. Additional bioinformatics methods
   d. Additional statistical methods

2. Appendices:
   a. Appendix 1: Model specifications

3. Supplementary Tables:
   a. Table S1: Significance and odds ratios of significantly differentially expressed gene overlaps
   b. Table S2: Gene lists and annotations of genes differentially expressed in 3 or 4 of the experiments with homologs related to drought (see separate file).
   c. Table S3: GO enrichment across experiments (see separate file).
   d. Table S4: Climatic conditions during harvest
   e. Table S5: Accession numbers for TAG-seq reads.

4. Supplementary Figures:
   a. Figure S1. The number of significant genes detected per fixed replication level.
   b. Figure S2: The differential impact of diurnal patterns on gene expression in the wet and dry treatments of the shelter experiment.
   c. Figure S3: The physiological and gene expression effects of the order of sample collection.
   d. Figure S4: Pairwise expression correlations.
   e. Figure S5. Conserved expression across all experiments.
   f. Figure S6. Rarefaction analysis of library sequencing depth.

5. Supplementary References

SUPPLEMENTARY METHODS

1.1 Additional physiological methods: In the greenhouse and shelter experiments, ambient midday PPFD was ~1500 µmol m−2 s−1, and the actinic PAR within the cuvette was maintained at 1500 µmol m−2 s−1. Ambient midday PPFD was slightly higher in the cylinder experiment, and cuvette PAR was set at 1700 µmol m−2 s−1. Chamber supply [CO2] was controlled at 400 µmol mol−1 in the cylinder and shelter experiments, and slightly lower (380 µmol mol−1) in the glasshouse experiment, matching the ambient [CO2] in the glasshouse, as measured with the LI-6400 infrared gas analyser. In all experiments, cuvette block temperature was set at ambient air temperature, and water vapour of the incoming air was not scrubbed, so that the chamber reference RH conditions were similar to atmospheric RH conditions.

Leaf photosynthesis

Leaf photosynthetic parameters were determined using two LI-6400 XT photosynthesis systems (LI-COR Inc, Lincoln, Nebraska). Photosynthesis systems were equipped with chlorophyll fluorometers (6400-40) integrated into the cuvette lid of an open gas exchange system and with CO2 mixers (6400-01). Cuvette conditions were set to a PPFD of 1700 µmol m−2 s−1, a CO2 concentration of 400 ± 2 µmol mol−1 (mean ± s.d.) and a leaf vapour pressure deficit of 3.9 ± 0.34 kPa. For light-adapted leaves, immediately following cuvette stability (approximately 2-3 min after the cuvette was sealed), measurements were taken to determine net CO2 assimilation rate, stomatal conductance to H2O and light use efficiency of photochemistry (ΦPSII), as well as the components of the latter: the light use efficiency of open reaction centres (Fv’/Fm’) and photochemical quenching (qP). Values were determined using the standard models built into the LI-6400 operating system.
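The relationship between ΦPSII and its two components can be illustrated with a short sketch. It assumes the commonly used fluorescence definitions, Fv’/Fm’ = (Fm’ − Fo’)/Fm’, qP = (Fm’ − Fs)/(Fm’ − Fo’) and ΦPSII = (Fm’ − Fs)/Fm’, which may differ in detail from the models built into the LI-6400 operating system. The function name and the input values are hypothetical.

```python
def psii_parameters(fs, fm_prime, fo_prime):
    """Return (phi_psii, fv_fm_prime, q_p) from light-adapted fluorescence yields.

    fs       -- steady-state fluorescence in the light (Fs)
    fm_prime -- maximal fluorescence in the light (Fm')
    fo_prime -- minimal fluorescence in the light (Fo')
    """
    if not (0 < fo_prime < fm_prime and fo_prime <= fs <= fm_prime):
        raise ValueError("expected 0 < Fo' <= Fs <= Fm' with Fo' < Fm'")
    fv_fm_prime = (fm_prime - fo_prime) / fm_prime   # efficiency of open centres
    q_p = (fm_prime - fs) / (fm_prime - fo_prime)    # photochemical quenching
    phi_psii = (fm_prime - fs) / fm_prime            # overall PSII efficiency
    return phi_psii, fv_fm_prime, q_p


# Hypothetical fluorescence readings (arbitrary units), for illustration only.
phi, fvfm, qp = psii_parameters(fs=600.0, fm_prime=1000.0, fo_prime=400.0)
print(phi, fvfm, qp)
```

Under these definitions ΦPSII equals the product of Fv’/Fm’ and qP, which is why the two quantities are described as its components.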

Growth Conditions

The cylinders were filled to a depth of at least 1 m with a mixture of dairy manure compost, composted rice hulls, decomposed granite and gypsum (Ranch Rose Mix, Geo Growers, Austin, TX). The base of each cylinder was sealed with a plastic end cap (M&P Pipe & Flange Protection Inc., Houston, TX, USA) into which were drilled 10 mm diameter drainage holes spaced 10 × 10 cm apart. Prior to filling, a thin layer (10 cm) of gravel was added to each cylinder to improve drainage.

Across experiments, sampling occurred on 6/13-6/14 2012, 8/5-8/6 (Temple) and 8/8-8/9 (Austin) 2013, and 8/7-8/8 (Temple) and 8/11-8/12 (Austin) 2014.

1.2 Additional RNA extraction and TAG-seq methods: Switchgrass RNA-Seq library samples were prepared using the TAG-seq protocol with minor modifications. Briefly, P. virgatum leaf samples (50-200 mg) were homogenized in Eppendorf tubes with steel beads on a Geno/Grinder 2000 (Spex SamplePrep). RNA was extracted with the standard Trizol protocol and treated with DNase I to remove contaminating genomic DNA. RNA concentration was quantified with Qubit (Invitrogen). One µg of DNase I-treated total RNA was incubated in PCR tubes at 95°C for 10 minutes to fragment the RNA to the desired size range (350-500 bp). First strand cDNA was synthesized with SuperScript II Reverse Transcriptase (Invitrogen) using the entire RNA sample in 20 µl reactions. DNA libraries were amplified using 15 cycles of PCR with 12 µl of first strand cDNA as the template. PCR products were purified using NucleoFast PCR Clean-up plates (Macherey-Nagel). Individual DNA libraries were then re-amplified with unique barcoded primers using 4 PCR cycles and 50 ng of purified cDNA as the template. After a quality control check, 32 individually barcoded RNA-seq libraries were combined in a single tube, re-purified over a NucleoFast 96 PCR plate and run on a 1.5% agarose gel, and DNA products in the 300-500 bp range were excised with a clean razor blade. PCR products were re-extracted from the gel using the PureLink Quick Gel extraction and PCR purification combo kit (Invitrogen). RNA-seq libraries were then submitted to the Genomic Sequencing and Analysis Facility (UT Austin) for DNA sequencing, aiming to obtain 5 million reads per sample.

1.3 Additional bioinformatics methods: Illumina-based sequencing samples (shelter experiments and cylinder experiments) were delivered demultiplexed on a per-run basis and each fastq file was processed individually. The employed 3' tag protocol selects molecules using a poly-A capture and utilizes a known inline adapter on the 5' end of the molecule. We therefore trimmed poly-A tails with cutadapt (Martin 2011) and then removed the 5' adapter sequence using fastx_trimmer (http://hannonlab.cshl.edu/fastx_toolkit/) where appropriate, followed by a second pass of cutadapt. The first pass of cutadapt searched for a 15mer poly-A sequence with a match of at least one base pair and allowed an error rate of 20% (-a "AAAAAAAAAAAAAAA" -O 1 -e .2). The fastx_trimmer removed the first five bases (-f 5). The second pass of cutadapt searched for a 10mer poly-G sequence with a match of at least one base pair and allowed an error rate of 20%, and only sequences of length greater than 75 base pairs were retained (-g "GGGGGGGGGG" -O 1 -e .2 -m 75). The trimmed sequences were subsequently aligned to the P. virgatum V2.0 reference using BWA-mem with default parameters (Li and Durbin 2009). Hits to “gene” annotations referenced by the “ID” tag in the P. virgatum V2.1 annotation (which was provided with the P. virgatum V2.0 reference release) were assessed with htseq-count (Anders et al. 2014) using the stranded union mode (--mode=union --stranded=yes --type=gene --idattr=ID). Counts for sample libraries with multiple sequencing files (e.g., sequenced on multiple runs) were aggregated via simple addition, producing a single count for each sample. For visual inspection, bedGraph files accounting for spliced alignments were made with bedtools (Quinlan and Hall 2010) using genomecov (-bg -split) and then converted to BigWig format using bedGraphToBigWig (Kent et al. 2010). A thorough visual evaluation suggested very good performance of the 3'-tag protocol as well as a complete and accurate annotation.

Color space reads (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE57887) were processed from sra to csfasta/QV.qual using abi-dump (http://www.ncbi.nlm.nih.gov/books/NBK158900/). Under the performed library protocol the first four sequenced bases are adapter and were removed along with poly-A tails using cutadapt. Again, a 15mer poly-A tail with a match of at least one base pair and a 20% error rate was allowed, but this was done in color space and only trimmed sequences greater than 30 base pairs were kept (-u 4 -a "AAAAAAAAAAAAAAA" -c -t -O 1 -e .2 -m 30). The color space reads were then aligned with bowtie (Langmead 2009) using direct color space alignment, allowing at most 3 mismatches and retaining at most 4 alignments (-C -S -v 3 -k 4). Gene counts were calculated as above with the version 2 resources using htseq-count, with NH flags added into the sam file in order to allow htseq-count to correctly address the multi-mapping reads (i.e., only uniquely aligning reads were retained).
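The three trimming steps described for the Illumina reads can be sketched as a small helper that assembles the command lines without running them. The trimming flags are copied from the text above. The file names, the intermediate-file naming scheme, and the use of -o (cutadapt) and -i/-o (fastx_trimmer) for input and output are illustrative assumptions, not details reported in the methods.

```python
import shlex

POLY_A = "A" * 15  # 15mer poly-A adapter given in the text
POLY_G = "G" * 10  # 10mer poly-G adapter given in the text


def trimming_commands(raw_fastq, prefix):
    """Build the three trimming steps as argument lists (nothing is executed).

    raw_fastq and prefix are hypothetical; intermediate file names are invented
    here purely to chain the steps together.
    """
    pass1 = f"{prefix}.polyA.fastq"
    clipped = f"{prefix}.clip.fastq"
    final = f"{prefix}.trimmed.fastq"
    return [
        # First cutadapt pass: 3' poly-A, minimum overlap 1, 20% error rate.
        ["cutadapt", "-a", POLY_A, "-O", "1", "-e", ".2", "-o", pass1, raw_fastq],
        # fastx_trimmer with the -f 5 setting quoted in the text.
        ["fastx_trimmer", "-f", "5", "-i", pass1, "-o", clipped],
        # Second cutadapt pass: 5' poly-G, same overlap and error rate, -m 75.
        ["cutadapt", "-g", POLY_G, "-O", "1", "-e", ".2", "-m", "75",
         "-o", final, clipped],
    ]


for cmd in trimming_commands("sample.fastq", "sample"):
    print(shlex.join(cmd))
```

Keeping each step as an argument list makes it straightforward to pass the commands to subprocess.run once the real file paths are known.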

1.4 Additional statistical methods: All statistical analyses and count preparation were conducted in R. Raw count matrices were culled to genes with means of >5 raw counts. Library sizes were inferred with the “calcNormFactors” function of edgeR. Significantly small libraries were identified as left-tail outliers with the “getOutliers” function from the extremevalues R package. This approach uses the relationship of extreme values to those within the 10-90th percentiles of the data. We normalized expression via the voom function in LIMMA, then conducted a principal component analysis of the transposed counts matrix to assess whether contamination was possible. Library positions were visually inspected and extreme outliers were removed. Row-wise statistical models (Appendix 1) were applied to the counts and paired experimental design data matrices. Statistical models were specified either as a formula or as a matrix of contrasts through a custom LIMMA pipeline.
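The first filtering step can be illustrated with a minimal sketch. The analysis itself was conducted in R; this Python version only mirrors the stated rule (retain genes whose mean raw count exceeds 5), and the gene names and counts are hypothetical.

```python
from statistics import mean


def filter_low_counts(counts, min_mean=5):
    """Keep genes whose mean raw count across libraries is strictly > min_mean.

    counts maps a gene identifier to its list of raw counts, one per library.
    """
    return {gene: row for gene, row in counts.items() if mean(row) > min_mean}


# Hypothetical raw counts for three genes across four libraries.
raw = {
    "geneA": [0, 2, 1, 3],      # mean 1.5, dropped
    "geneB": [12, 8, 20, 15],   # mean 13.75, kept
    "geneC": [5, 5, 5, 5],      # mean exactly 5, dropped (not > 5)
}
kept = filter_low_counts(raw)
print(sorted(kept))  # ['geneB']
```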

Soil moisture release curves were analyzed with a non-linear least squares (nls) model (Ross et al. 1991). Predicted values of soil water potential were calculated from this model for each measurement.

APPENDICES

Appendix #1: Model specifications

Here, we present the formulas, contrasts and blocking variables for each model used to test statistical significance. Such models are outlined below with a short description of the reasoning of model choice and the parameters supplied. Models are specified here with a formula. For information about how the formula is applied to the statistical model, see the LIMMA documentation.

2012 Shelter models:

Model Justification | Blocking Variable | Model Formula / Contrasts
main effect of treatment | Sub_Block | Y ~ Treatment
contrasts among all treatment levels | Sub_Block | contrast.matrix = 25th-low, mean-low, ambient-low, 75th-low, high-low, mean-25th, ambient-25th, high-25th, ambient-mean, 75th-mean, high-mean, 75th-ambient, high-ambient, high-75th
*main effect of Ψleaf | Sub_Block | Y ~ Ψleaf
*order as a main effect | NA | Y ~ order

* Here design and counts are subset to only those genotypes with physiology data.
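The contrast list for the 2012 shelter experiment follows a regular "higher level minus lower level" pattern, which can be generated programmatically. The sketch below is illustrative only: the ordering of treatment levels is inferred from the direction of the listed contrasts and is an assumption, and the function name is hypothetical. It enumerates every pair of the six levels (15 contrasts), whereas the list as printed above contains 14 (75th-25th does not appear).

```python
from itertools import combinations

# Assumed ordering of treatment levels, from driest to wettest.
LEVELS = ["low", "25th", "mean", "ambient", "75th", "high"]


def pairwise_contrasts(levels):
    """Return 'upper-lower' contrast labels for every pair of ordered levels."""
    return [f"{upper}-{lower}" for lower, upper in combinations(levels, 2)]


contrasts = pairwise_contrasts(LEVELS)
print(len(contrasts))
print(", ".join(contrasts))
```

Labels in this form could then be supplied to a contrast-building routine such as makeContrasts in LIMMA, provided the design matrix columns use the same level names.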

2013-14 Shelter models:

Model Justification | Blocking Variable | Model Formula / Contrasts
main effects of treatment, year and location | Sub_Block | Y ~ Treatment * Location + Year
contrasts among all treatment levels | Sub_Block | contrast.matrix = TMP_13_high - TMP_13_low, WFC_13_high - WFC_13_low, TMP_14_high - TMP_14_low, WFC_14_high - WFC_14_low
main effect of Ψleaf | Sub_Block | Y ~ Ψleaf + Location + Year
order as a main effect | NA | Y ~ order + Location + Year
+effect of treatment within each location | Sub_Block | Y ~ Treatment + Year
+effect of Ψleaf within each location | Sub_Block | Y ~ Ψleaf + Year

+ This model was fit independently in each location.

2011 Cylinder models. The dusk sampling (presented in Fig. 1c-d) is dropped from these analyses so that models are comparable between the greenhouse and cylinder experiments.

Model Justification | Blocking Variable | Model Formula / Contrasts
main effects of treatment, year and location | Unique Plant | Y ~ Treatment * Sampling Time
contrasts among all treatment levels | Unique Plant | contrast.matrix = dawn(wet) - dawn(dry), midday(wet) - midday(dry)
main effect of Ψleaf | Unique Plant | Y ~ Ψleaf
$effect of treatment within each time of sampling | Unique Plant | Y ~ Treatment
$effect of Ψleaf within each location | Unique Plant | Y ~ Ψleaf + Treatment

2012 Greenhouse models.

Model Justification | Blocking Variable | Model Formula / Contrasts
main effects of treatment, year and location | Unique Plant | Y ~ Treatment * Sampling Time + Day
contrasts among all treatment levels | Unique Plant | contrast.matrix = dawn(wet) - dawn(dry), midday(wet) - midday(dry)
main effect of Ψleaf | Unique Plant | Y ~ Ψleaf
$effect of treatment within each time of sampling | Unique Plant | Y ~ Treatment
$effect of Ψleaf within each location | Unique Plant | Y ~ Ψleaf + Treatment

SUPPLEMENTARY TABLES

Table S1: Significance and odds ratios of significantly differentially expressed gene overlaps. Odds ratios are colored from most overlap (red) to least (blue) and are presented below the diagonal. P-values are presented above the diagonal.

 | Shelter (2012) | Shelter (2013-14) | Cylinder | Greenhouse
Shelter (2012) | - | 9.10E-52 | 2.30E-65 | 1.32E-37
Shelter (2013-14) | 7.14 | - | 8.11E-82 | 2.16E-46
Cylinder | 4.18 | 5.23 | - | 8.89E-278
Greenhouse | 2.91 | 3.41 | 3.75 | -

Table S2: Gene lists and annotations of genes differentially expressed in 3 or 4 of the experiments with homologs related to drought (see separate file).

Table S3: GO enrichment across experiments (see separate file).

Table S4. Atmospheric conditions at the two shelter experiment sites. The 2012 sampling was conducted in June at Temple. The 2013-2014 samplings were conducted in August.

Conditions during physiological measurements | Temple, June | Temple, August | Austin, June | Austin, August
Cumulative precipitation (mm), Precipitation = High | 491 | 604 | 409 | 520
Cumulative precipitation (mm), Precipitation = Mean | 262 | 375 | - | -
Cumulative precipitation (mm), Precipitation = Low | 145 | 195 | 126 | 148
Air temperature (°C) | 30.3 ± 2.6 | 36.1 ± 2.6 | 31.1 ± 2.1 | 35.7 ± 3.1
Relative humidity (%) | 66.6 ± 10.2 | 43.7 ± 7.8 | 61.2 ± 10.1 | 41.8 ± 10.6
Vapor pressure deficit (kPa) | 1.5 ± 0.6 | 3.4 ± 0.9 | 1.8 ± 0.6 | 3.5 ± 1.1
Volumetric water content (m3 m-3), Precipitation = High | 0.36 ± 0.01 | 0.24 ± 0.02 | 0.19 ± 0.06 | 0.10 ± 0.04
Volumetric water content (m3 m-3), Precipitation = Mean | 0.22 ± 0.01 | 0.21 ± 0.01 | - | -
Volumetric water content (m3 m-3), Precipitation = Low | 0.19 ± 0.02 | 0.13 ± 0.02 | 0.09 ± 0.03 | 0.07 ± 0.02

Table S5: Accession numbers for TAG-seq reads. All raw sequencing data has been submitted in the short read archive under BioProject ID PRJNA322529. Here, the experiment ID, treatment, date of sequencing, date of submission and accession number are presented for all sequenced individuals. (See separate file).

SUPPLEMENTARY FIGURES

Figure S1. The number of significant genes detected per fixed replication level. For each experiment, the number of individual libraries was rarefied to n = 20, 22, …, 40 and randomly sampled 10x within each rarefaction level. At each sampling, a test for differential expression was conducted and the number of genes with FDR-corrected P-values ≤ 0.05 was counted (“n significant genes”). A loess model was fit to each experiment and the resultant curves (± standard error) are plotted.
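The subsampling scheme behind Figure S1 can be sketched as follows. This is only an outline under stated assumptions: libraries are drawn by simple random sampling without replacement (any stratification by treatment is not described in the caption), the library identifiers and the seed are hypothetical, and the differential expression test applied to each draw is not reproduced here.

```python
import random


def replication_subsamples(library_ids, levels=range(20, 41, 2), draws=10, seed=0):
    """Return {n: [draw_1, ..., draw_k]} of library subsets at each replication level.

    levels defaults to n = 20, 22, ..., 40 and draws to 10, as in the caption.
    """
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    plan = {}
    for n in levels:
        if n > len(library_ids):
            continue  # cannot rarefy beyond the number of available libraries
        plan[n] = [sorted(rng.sample(library_ids, n)) for _ in range(draws)]
    return plan


# 48 hypothetical library identifiers for one experiment.
libraries = [f"lib{i:02d}" for i in range(1, 49)]
plan = replication_subsamples(libraries)
print(sorted(plan))          # replication levels that were sampled
print(len(plan[20]))         # number of random draws at n = 20
```

Each subset in the plan would then be passed to the differential expression model, and the count of significant genes recorded per draw.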

Figure S2: The differential impact of diurnal patterns on gene expression in the wet and dry treatments of the shelter experiment. Log2 fold changes between predawn and midday sampling for each treatment are plotted for all genes with significant differential expression related to diurnal patterns. Each point represents a single gene and is colored by whether the absolute log2 fold change between predawn and midday was greater in the drought treatment (red) or control (blue).

Figure S3: The physiological and gene expression effects of the order of sample collection. Midday Ψleaf for each sampled plant is plotted by the order in which it was sampled. Linear model fits are plotted for each treatment (blue = wet, red = dry, green = mean). The effect of sampling order on standardized, voom-normalized expression counts is plotted for 20 of the 21 significant genes in 2012. Loess spline curves ± SE accompany the plots.

Figure S4: Pairwise expression correlations between time of day and experiments. Log2 fold changes attributable to treatment were extracted for all fitted models and culled to genes that were differentially expressed in all experiments. Correlation coefficients (r) are printed above the diagonal. In lieu of points for each gene, density bins are plotted below the diagonal.

[Figure S4 image not reproduced. Panels, in order: y2012, y201314.temple, y201314.austin, greenhouse.midday, cylinder.midday, greenhouse.predawn, cylinder.preawn. Correlation coefficients (r) printed above the diagonal, by row, against each subsequent panel:]

y2012: 0.539, 0.138, 0.597, 0.673, 0.554, 0.486
y201314.temple: 0.165, 0.527, 0.585, 0.487, 0.42
y201314.austin: 0.135, 0.12, 0.136, 0.0749
greenhouse.midday: 0.733, 0.782, 0.528
cylinder.midday: 0.627, 0.521
greenhouse.predawn: 0.632

Figure S5. Conserved expression across all experiments. The number of genes differentially expressed (α = 0.05) across drought treatments was tallied in a Venn diagram (A). The highlighted 84 genes in the center represent a core set that are differentially expressed across all experiments. To visualize the consistency of this effect, α was relaxed to 0.1 and a quadrilateral, with vertices representing the log2 fold change for each experiment was plotted for each significant gene (B). The plot is bounded by a maximum log2 fold change of 5, and the red dashed line indicates a log2 fold change of 0.

Figure S6. Rarefaction analysis of library sequencing depth. For each experiment (red = cylinder, green = greenhouse, blue = 2012 shelter, purple = 2013-14 shelter), the number of genes with >5 raw counts and the total library size are plotted (top). The number of genes detected for each experiment given a 0-200% change in sequencing depth is plotted in the bottom panel.

[Figure S6 image not reproduced. Top panel: "individual library detection", one facet per experiment (cylinder, greenhouse, y2012, y201314); y-axis: n. genes detected (>5 counts). Bottom panel: "experiment-wide detection"; x-axis: % of observed library size; y-axis: n. genes detected (>5 counts).]

SUPPLEMENTARY REFERENCES

Kent WJ, Zweig AS, Barber G, Hinrichs AS, Karolchik D. 2010. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26:2204–2207.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.

Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17:10–12.

Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842.

Ross PJ, Williams J, Bristow KL. 1991. Equation for Extending Water-Retention Curves to Dryness. Soil Science Society of America Journal 55:923–927.