Article
Cross-Study Comparison Reveals Common Genomic, Network, and Functional Signatures of Desiccation Resistance in Drosophila melanogaster
Marina Telonis-Scott,1 Carla M. Sgrò,1 Ary A. Hoffmann,2 and Philippa C. Griffin2
1School of Biological Sciences, Monash University, Clayton, Melbourne, VIC, Australia
2School of BioSciences, Bio21 Institute, University of Melbourne, Parkville, Melbourne, VIC, Australia
Corresponding author: E-mail: marina.telonisscott@monash.edu.
Associate editor: Ilya Ruvinsky
Abstract
Repeated attempts to map the genomic basis of complex traits often yield different outcomes because of the influence of genetic background, gene-by-environment interactions, and/or statistical limitations. However, where repeatability is low at the level of individual genes, overlap often occurs in gene ontology categories, genetic pathways, and interaction networks. Here we report on the genomic overlap for natural desiccation resistance from a Pool-genome-wide association study experiment and a selection experiment in flies collected from the same region in southeastern Australia in different years. We identified over 600 single nucleotide polymorphisms associated with desiccation resistance in flies derived from almost 1,000 wild-caught genotypes, a similar number of loci to that observed in our previous genomic study of selected lines, demonstrating the genetic complexity of this ecologically important trait. By harnessing the power of cross-study comparison, we narrowed the candidates from almost 400 genes in each study to a core set of 45 genes enriched for stimulus, stress, and defense responses. In addition to gene-level overlap, there was higher order congruence at the network and functional levels, suggesting genetic redundancy in key stress sensing, stress response, immunity, signaling, and gene expression pathways. We also identified variants linked to different molecular aspects of desiccation physiology previously verified from functional experiments. Our approach provides insight into the genomic basis of a complex and ecologically important trait and predicts candidate genetic pathways to explore in multiple genetic backgrounds and related species within a functional framework.
Key words: desiccation, Drosophila, GWAS, gene overlap.
Introduction
In climate change research, there is increasing interest in considering not only the obvious impact of changing temperatures on biodiversity but also fluctuations in rainfall and humidity (Bonebrake and Mastrandrea 2010; Clusella-Trullas et al. 2011; Chown 2012). Changes in water availability pose specific challenges to terrestrial ectotherms such as insects, impacting activity, range distributions, species richness, and disease vector populations (reviewed in Chown et al. 2011 and references therein). Efforts to understand the responses of insects to water availability are underway, given that physiological responses are an integral component of predicting species responses to climate change (Chown et al. 2011; Hoffmann and Sgrò 2011). Insect water balance physiology is relatively well elucidated (Hadley 1994; Chown and Nicolson 2004; Bradley 2009), and the field is undergoing a crucial shift toward understanding the molecular underpinnings, largely achieved through high-throughput and transgenic technologies in Drosophila.
Insects can lose water from the epicuticle, through respiration via the spiracles, or across the gut epithelia (Hadley 1994). Drosophila balance water via three main mechanisms: altering water content, slowing water loss rate, and, less commonly, tolerating water loss (Gibbs and Matzkin 2001; Gibbs et al. 2003). Water retention appears to be the primary mechanism for withstanding desiccation in highly resistant cactophilic Drosophila (Gibbs and Matzkin 2001) as well as in Drosophila melanogaster (Telonis-Scott et al. 2006). How water is preserved, however, is highly variable depending on the species and/or genetic background. Cactophilic species have repeatedly colonized arid habitats via reduced metabolic rates that stem respiratory and cuticular water loss and extend energy utilization (Gibbs and Matzkin 2001; Marron et al. 2003), while in the laboratory, water retention and sequestration arise from multiple evolutionary pathways less clearly related to metabolic rate (Hoffmann and Parsons 1989a), often acting in concert (Hoffmann and Parsons 1989a; Djawdan et al. 1998; Folk et al. 2001; Telonis-Scott et al. 2006, 2012). Less resolved is the role of respiratory relative to cuticular water loss via discontinuous gas exchange, although this might be an important water budget strategy for insects in general (Chown et al. 2011).
Recent candidate gene-based approaches have seen new developments in understanding specific molecular aspects of this variable ecological trait in Drosophila. Diuretic peptide signaling in ion and water balance, regulated by excretion via the Malpighian tubules (MTs) and gut absorption, has been shown to impact desiccation resistance in D. melanogaster (Kahsai et al. 2010; Terhzaz et al. 2012, 2014, 2015). Specifically, knockdown of the Capability (Capa) neuropeptide gene enhanced desiccation resistance by reductions in respiratory, cuticular, and excretory water loss, while also cross-conferring cold tolerance (Terhzaz et al. 2015). Cuticular hydrocarbon (CHC) levels are also implicated; a single gene encoding a fatty acid synthase, mFAS, was shown to impact both desiccation resistance and mate choice in Australian rainforest endemics Drosophila serrata and Drosophila birchii (Chung et al. 2014). In D. melanogaster, knockdown of CYP4G1 also reduced CHC production and impaired survival under low humidity (Qiu et al. 2012). Metabolic signaling has also been shown to impact desiccation resistance and includes components of the insulin signaling pathways (Söderberg et al. 2011; Liu et al. 2015). Several other single gene studies highlight different mechanisms, such as desiccation avoidance in larvae via nociceptors encoded by members of the transient receptor potential (TRP) aquaporin family (Johnson and Carder 2012), cyclic adenosine monophosphate (cAMP)-dependent signaling protein kinase desi (Kawano et al. 2010), and potential tissue protection via trehalose sugar accumulation (Thorat et al. 2012).
Collectively, these functional approaches highlight the complexity of insect water balance strategies in controlled backgrounds but do not explain the variation observed in natural populations and species. Resistance evolves rapidly and is highly heritable (up to 60%) in the cosmopolitan and unusually resistant species D. melanogaster, whereas heritability is much lower in less tolerant, range-restricted species (Hoffmann and Parsons 1989b; Kellermann et al. 2009). Multiple genome-wide quantitative trait loci were identified from mapping lines constructed from a natural D. melanogaster population, suggesting a polygenic architecture (Foley and Telonis-Scott 2011). Transcriptomics revealed differential regulation of thousands of genes in response to desiccation in Drosophila mojavensis (Matzkin and Markow 2009), while artificial selection for desiccation resistance altered the basal expression of over 200 genes in D. melanogaster (Sørensen et al. 2007).
Previously, we used microarray-based genomic hybridization to survey allele frequency shifts in experimental evolution lines of D. melanogaster recently derived from the field (Telonis-Scott et al. 2012). We documented shifts in over 600 loci in response to selection for desiccation resistance following a rapid phenotypic response after only 8 generations of selection. This variant identification approach was limited to highlighting candidate genes and regions, and not actual nucleotide polymorphisms apart from those sequenced post hoc. We now expand on this to further explore the unresolved natural genomic complexity of desiccation resistance using a high-throughput sequencing Pool-GWAS (genome-wide association study) approach (Bastide et al. 2013). In a GWAS framework for polygenic traits, where many small effect genes contribute to a phenotype, a large sample size is required for adequate power (Mackay et al. 2009). Accordingly, we sampled natural genetic variation from almost 1,000 inseminated wild-caught females and compared the upper 5% desiccation-resistant tail from almost 10,000 F1 progeny with a control sample chosen randomly from the same progeny set. Harnessing the power of Pool-GWAS, we sequenced a pool of 500 natural “resistant” genomes and compared allele frequencies with a pool of 500 “random control” genomes.
We employed a novel comparative approach with the resulting set of candidate loci and those detected in our previous artificial selection study of flies collected several seasons earlier from the same region of southeastern Australia (Telonis-Scott et al. 2012). Cross-study repeatability of loci associated with complex phenotypes is often low (Sarup et al. 2011) and can be influenced by genetic background, selection intensity, epistatic effects, and gene-by-environment interactions (Mackay et al. 2009; Civelek and Lusis 2014). Multiple genetic solutions appear to underlie the array of desiccation responses, and we further explored this by investigating overlap at the level of individual genes, functional gene ontology (GO) categories, and protein–protein interaction (PPI) networks. Genetic variants contribute to final phenotypes by way of “intermediate phenotypes” (transcript, protein, and metabolite abundances), and correlations should theoretically occur across these biological scales (Civelek and Lusis 2014).
Here, we report on a screen of natural nucleotide variants associated with desiccation resistance and, using a powerful analysis approach, demonstrate common cross-study signatures across different hierarchical levels from gene to network and function. We found evidence that some functional desiccation candidates may be important in wild populations and discovered variants linked to multiple physiological responses, consistent with the trait’s complex underpinnings at both the physiological and molecular levels.
Results
Genome Wide Differentiation for Natural Desiccation Resistance
We collected over 1,000 D. melanogaster isofemale lines from southern Australia and, after one generation of laboratory culture, screened over 9,000 female progeny for desiccation resistance. The top ~5% desiccation-resistant flies were selected for Illumina sequencing (~500 flies), together with a random sampling of the same number of flies from the families as the “control” pool. Our design incorporated a subpooling (technical replicate) strategy to both optimize Pool-seq allele frequency estimates and control for technical bias on allele frequency estimates (see Materials and Methods). For the control and desiccation-resistant samples, respectively, 162M–259M and 188M–289M raw reads were obtained per technical replicate, and an average of 95.0% of reads mapped to the reference genome after trimming (mean of all 10 replicates; standard deviation [SD] 0.5%).
Variance in allele frequency among the five technical replicates for each pool was low (supplementary fig. S1, Supplementary Material online; mean across control replicates 0.0028, median 0.0015; mean across resistant replicates 0.0027, median 0.0016), corresponding to a mean SD of ~4.5% around the mean allele frequency. The mean pairwise difference in allele frequency estimates was 5.5% and 5.4% for control and desiccation-resistant replicates, respectively, consistent with previously published estimates (4–6%; Kofler et al. 2016).
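The two replicate summaries reported above (variance among technical replicates and mean pairwise difference in allele frequency) can be sketched for a single SNP as follows. This is a minimal illustration only: the function name `replicate_summary` and the five example frequencies are hypothetical, and the use of the sample (n - 1) variance is an assumption, since the exact estimator is not stated in this section.

```python
from itertools import combinations
from statistics import mean, variance


def replicate_summary(freqs):
    """Return (sample variance, mean pairwise absolute difference) for one SNP.

    `freqs` holds the allele frequency estimated in each technical replicate.
    """
    pairwise = [abs(a - b) for a, b in combinations(freqs, 2)]
    return variance(freqs), mean(pairwise)


# Hypothetical allele frequencies for one SNP in five technical replicates.
example = [0.50, 0.52, 0.48, 0.51, 0.49]
var, mpd = replicate_summary(example)
print(f"variance={var:.5f}, mean pairwise difference={mpd:.3f}")
```

Averaging these per-SNP values across many SNPs would give genome-wide summaries of the kind quoted in the text.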
For a test subset of 1,803 randomly sampled single nucleotide polymorphisms (SNPs) that showed no significant differentiation between control and desiccation-resistant pools, the replicates accounted for the majority of total variance in allele frequency. Replicates within pool category (control vs. desiccation resistant; supplementary fig. S1, Supplementary Material online) explained between 15% and 100% (mean 87%, median 93%) of what was a relatively low total variance (mean 0.0029, median 0.0019). The mean concordance correlation coefficient was 0.98 for each pool category, higher than a comparable Drosophila study that reported a correlation of 0.898 between 2 replicates with 20 individuals (Zhu et al. 2012). We therefore combined the technical replicates into a single control and resistant pool, respectively, for further analyses. After processing, the mean coverage depth across the genome was 487× and 568× per sample, respectively (equivalent to 0.97× and 1.1× per individual in the pools of 500 flies).
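As a rough illustration of the concordance correlation coefficient mentioned above, the sketch below assumes the standard definition by Lin (population moments, penalizing both scatter and location shift between two replicates). The function name and the example frequencies are hypothetical and are not taken from the study.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two paired samples."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Unlike Pearson's r, a shift in mean between replicates lowers the score.
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


# Hypothetical allele frequencies at four SNPs in two technical replicates.
rep1 = [0.10, 0.20, 0.30, 0.40]
rep2 = [0.20, 0.30, 0.40, 0.50]  # same pattern, shifted upward by 0.10
print(round(concordance_correlation(rep1, rep1), 3))
print(round(concordance_correlation(rep1, rep2), 3))
```

The second call shows why concordance is a stricter criterion than correlation: the two replicates are perfectly correlated, yet the systematic shift reduces the coefficient.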
Allele Frequency Changes
The allele frequencies of 3,528,158 SNPs were compared between the 2 pools using Fisher’s exact tests with a permutation-based false discovery rate (FDR) correction chosen based on the top 0.05% of the simulated null P values. Six hundred forty-eight SNPs were differentiated between desiccation-resistant and control pools (fig. 1 and supplementary table S1, Supplementary Material online). The SNPs were nonrandomly distributed across chromosomes ($\chi^2_4 = 43.12$, P < 0.0001), highly abundant on the X chromosome ($\chi^2_1 = 36.98$, P < 0.0001), while SNPs were underrepresented on both 2L and 3L (2L: $\chi^2_1 = 6.41$, P < 0.01; 3L: $\chi^2_1 = 3.97$, P < 0.05). In the desiccation pool, the allelic distribution of the 648 SNPs ranged from low to fixed (4–100%; fig. 1B and supplementary table S1, Supplementary Material online). The increase in frequency of the favored “desiccation” alleles compared with the controls ranged from 3% to 41% (median increase 14%; supplementary table S1, Supplementary Material online), where the largest shifts tended to occur in common intermediate frequencies (fig. 1B).

Fig. 1. (A) Manhattan plot of genome-wide P values for differentiation between desiccation-resistant and random control pools, shown for the entire genome, where each point represents one SNP. SNPs highlighted in red showed differentiation above the 0.05% threshold level determined from the null P value distribution (see text for details). SNPs highlighted in blue were above the 0.05% threshold and also detected in Telonis-Scott et al. (2012). (B) Standing allele frequency in the control pool plotted against the frequency change in the selected pool for the 648 differentiated SNPs (FDR 0.05).
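The per-SNP comparison described in this section rests on Fisher's exact test applied to allele counts in the two pools. The sketch below is a minimal pure-Python version of a two-sided test on a 2×2 table; the function name and the read counts are hypothetical, and the permutation-based FDR threshold used in the study is not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    Rows are pools (e.g., resistant, control); columns are allele counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x copies of allele 1 in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical allele counts at one SNP: resistant pool vs. control pool.
p = fisher_exact_two_sided(340, 160, 270, 230)
print(f"P = {p:.2e}")
```

In a genome-wide screen this test would be repeated at every SNP, which is why a multiple-testing threshold such as the simulated null distribution described above is needed.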
Inversion Frequencies
To test whether inversions were associated with desiccation resistance, allele frequencies were checked at SNPs diagnostic for the following inversions: In(3R)P (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)NS, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). In(3L)P, In(3R)K, and In(3R)Mo were not detected in this population, and for all other inversions investigated, the control and desiccation-resistant pool frequencies did not differ by more than 6%.
Genomic Distribution
We examined these candidates using a comprehensive approach encompassing SNP, gene, functional ontology, and network levels. First, we examined SNP locations with respect to gene structure and putative effects of SNPs on genome features. The 648 SNPs mapped to 382 genes and were largely noncoding (79%; supplementary table S2, Supplementary Material online). Coding variants comprised nearly 15% of candidate SNPs, largely synonymous substitutions, and the remainder nonsynonymous missense variants predicted to result in amino acid substitution with length preservation (9 and 14; supplementary table S2, Supplementary Material online). Based on proportions of features from all annotated SNPs compared with the candidates, intron variants were significantly underrepresented ($\chi^2_1 = 7.06$, P < 0.01; supplementary table S2, Supplementary Material online), while SNPs were enriched in 5′UTRs, 5′UTR premature start codon sites, and splice regions ($\chi^2_1 = 8.20$, P < 0.01; $\chi^2_1 = 31.24$, P < 0.0001; $\chi^2_1 = 6.40$, P < 0.05, respectively; supplementary table S2, Supplementary Material online).
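The feature comparisons above are one-degree-of-freedom chi-square tests. A minimal sketch of such a test is given below, assuming a goodness-of-fit design in which the number of candidate SNPs in a feature class is compared with the proportion of that class among all annotated SNPs; the authors' exact procedure is not described in this section, and the function name and counts are hypothetical.

```python
from math import erfc, sqrt


def chi2_one_df(observed_in, total, expected_prop):
    """Pearson chi-square (1 df) for a count against an expected proportion.

    Returns (statistic, P value). For 1 df the upper-tail probability of the
    chi-square distribution equals erfc(sqrt(statistic / 2)).
    """
    exp_in = total * expected_prop
    exp_out = total - exp_in
    obs_out = total - observed_in
    stat = (observed_in - exp_in) ** 2 / exp_in + (obs_out - exp_out) ** 2 / exp_out
    return stat, erfc(sqrt(stat / 2))


# Hypothetical example: 60 of 100 candidate SNPs fall in a feature class
# that makes up 50% of all annotated SNPs.
stat, p = chi2_one_df(60, 100, 0.5)
print(f"chi2 = {stat:.2f}, P = {p:.4f}")
```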
GO Functional Enrichment
For the GO analyses, we found no overrepresented terms using the stringent Gowinda gene mode but identified several categories using SNP mode with a 0.05 FDR threshold. GO terms were enriched for the broad categories of “chromatin,” “chromosome,” and, more specifically, “double-stranded RNA binding,” although significance of this category was due solely to nine differentiated SNPs in the DIP1 gene. The term “gamma-secretase complex” was also enriched due to three SNPs in the pen-2 gene.
Desiccation Candidate Genes
Chown et al. (2011) comprehensively summarized the primary mechanisms underpinning desiccation resistance, from the critical first phase of stress sensing and behavioral avoidance to the manifold physiological trajectories available to counter water stress. These include water content variation, water loss rates (desiccation resistance), and tolerance of water loss (dehydration tolerance). We expanded on this review to incorporate recent molecular research and summarized the primary mechanisms and known candidate genes relevant to D. melanogaster stress sensing and water balance (table 1 and references therein). Informed by this body of work, we applied a “top-down” mechanistic approach to screening desiccation candidate genes in our data (e.g., from initial hygrosensing to downstream key physiological pathways; table 1). We did not observe differentiation in SNPs mapping to the TRP ion channels, the key hygrosensing receptor genes. However, we did map SNPs to genes involved in stress sensing specific to the MT fluid secretion signaling pathways, including two of the six cAMP and cyclic guanosine monophosphate (cGMP) signaling phosphodiesterases, dnc and pde9, as well as klu, implicated in NF-κB ortholog signaling in the renal tubules (table 1). A number of the stress-sensing SAPK pathway genes were differentiated (table 1). Key genes involved in the insulin signaling pathway, implicated in metabolic homeostasis and water balance, were also differentiated, including the insulin receptor InR and insulin receptor binding Dok (table 1).
Network Analysis: Modeling Genomic Differentiation for Desiccation in a PPI Context
Given the genomic complexity underpinning desiccation resistance, we also implemented a higher-order analysis to provide an overview of potential architectures beyond the SNP and gene levels using PPI networks. The SNPs mapped to 382 genes, which were annotated to 307 seed proteins occurring in the full background D. melanogaster PPI network. The seed proteins produced a large first-order network with 3,324 nodes and 51,299 edges.
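To make the idea of a first-order network concrete, the sketch below builds one from a toy interaction list. It assumes that "first-order" means the seed proteins plus their direct interaction partners, keeping every interaction that touches at least one seed; this interpretation, the function name, and the toy protein labels are assumptions for illustration and do not reproduce the study's network construction.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) of the first-order network around `seeds`.

    `edges` is an iterable of (protein_a, protein_b) pairs from a background
    interaction network. An edge is kept if either endpoint is a seed.
    """
    seeds = set(seeds)
    kept = {frozenset(e) for e in edges if e[0] in seeds or e[1] in seeds}
    nodes = set().union(*kept) if kept else set()
    return nodes, kept


# Toy background network with hypothetical protein labels.
background = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
nodes, kept = first_order_network(background, seeds={"A", "E"})
print(sorted(nodes), len(kept))
```

Because each seed pulls in all of its partners, a few hundred seeds can expand into a network of several thousand nodes, as reported above.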
To determine if broader functional signatures could be ascertained from the complex network, we employed functional enrichment analyses on the 3,324 nodes (table 2 and supplementary table S3, Supplementary Material online). Strikingly, the network was enriched for pathways involved in gene expression, from transcription through RNA processing to RNA transport, metabolism, and decay (table 2). The network analysis also revealed further complexity of the stress response, with enrichment of proteins involved in regulating innate insect immunity (table 2), where multiple pathways were overrepresented, including cytokine, interleukin, and toll-like receptor signaling cascades (table 2). Developmental/gene regulation signaling pathways were significant in the pathway analysis; the KEGG database showed enrichment for Notch and Mitogen-activated protein kinase (MAPK) signaling, while insulin and NGF pathways were highlighted from the Reactome database (table 2).
Common Signatures of Desiccation Resistance across Studies
The key finding of this study is the degree of genomic overlap with our previous mapping of alleles associated with artificially selected desiccation phenotypes (Telonis-Scott et al. 2012). Although the flies were sampled from comparable southeastern Australian locations, the study designs were different. Specifically, the earlier study focused on resistance generated by artificial selection, whereas this study focused on natural variation in resistance. Despite the differences in study design, we identified 45 genes common to both studies
Table 1. Summary of Primary Molecular Responses and Candidate Genes for Desiccation Response Mechanisms Identified in the Literature.

Mechanism | Molecular Function | Genes | Current Study (F1 wild-caught) | Telonis-Scott et al. (2012) (F8 artificial selection)

Hygrosensing: Sensory moisture receptors (antennae, legs, gut, mouthparts)
Temperature-activated transient receptor potential ion channels [b,c,d] | TRP ion channels | trpl, trp, Trpγ, TrpA1, Trpm, Trpml, pain, pyx, wtrw, nompC, iav, nan, Amo | - | trpl

Stress sensing: Malpighian tubules (principal cells)
Tubule fluid secretion signaling pathways
cAMP signaling [e,f,g] | cAMP activation | Dh44-R2 | - | -
cGMP signaling [e,f,g] | cGMP activation | Capa, CapaR, Nplp1-4 | - | -
 | Receptors | Gyc76c, CG33958, CG34357 | - | CG34357
 | Kinases | dg1-2 | - | -
cAMP + cGMP signaling [e,f,g,h] | Hydrolyzing phosphodiesterases | dnc, pde1c, pde6, pde9, pde11 | dnc, pde9 | dnc, pde1c, pde9, pde11
Calcium signaling [e,g] | Calcium-sensitive nitric oxide synthase | Nos | - | Nos
Desiccation specific | Diuretic neuropeptide NF-κB signaling [i,j,k] | Capa, CapaR, trpl, Relish, klu | klu | klu, trpl
 | Anti-diuretic neuropeptide [l] | ITP, sNPF, Tk | - | ITP, sNPF

Stress sensing: Stress responsive pathway
Stress activated protein kinase pathway [m,n] | MAPK/Jun kinase | Aop, Aplip1, Atf-2, bsk, Btk29A, cbt, Cdc42, Cka, cno, CYLD, Dok, egr, Gadd45, hep, hppy, Jra, kay, medo, Mkk4, msn, Mtl, Pak, puc, pyd, Rac1-2, Rho1, shark, slpr, Src42A, Src64B, Tak1, Traf6 | Aop, Dok, hep, kay, msn, Mtl, puc, Rac1, Src64B | Mtl, slpr

Metabolic homeostasis and water balance
Components of insulin signaling pathway | Receptor [o,p] | InR | InR | -
 | Receptor binding | chico, Ilp1-8, dock, Dok, poly | Dok | Poly

Resistance mechanisms: Water loss barriers
Cuticular hydrocarbons | Fatty acid synthases [q], desaturases [r], transporters, oxidative decarbonylase [t] | CG3523, CG3524, desat1-2, FatP, Cyp4g1 | CG3523 | -

Primary hemolymph sugar/tissue-protectant
Trehalose metabolism | Trehalose-phosphatase [b,u], trehalase activity [b,u], phosphoglycerate mutase | Tps1, Treh, Pgm, CG5171, CG5177, CG6262, crc | Treh, Pgm | -

NOTE: Candidate genes differentiated in the current study and in Telonis-Scott et al. (2012) are shown and are underlined where overlapping.
[a] For brevity, several comprehensive reviews are cited, and readers are referred to references therein. Note that water budgeting via discontinuous gas exchange was not included due to a lack of “bona fide” molecular candidates, and desiccation tolerance due to aquaporins was omitted as genes were not differentiated in either of our studies.
[b] Chown et al. (2011). [c] Johnson and Carder (2012). [d] Fowler and Montell (2013). [e] Day et al. (2005). [f] Davies et al. (2012). [g] Davies et al. (2013). [h] Davies et al. (2014). [i] Terhzaz et al. (2012). [j] Terhzaz et al. (2014). [k] Terhzaz et al. (2015). [l] Kahsai et al. (2010). [m] Huang and Tunnacliffe (2004). [n] Huang and Tunnacliffe (2005). [o] Söderberg et al. (2011). [p] Liu et al. (2015). [q] Chung et al. (2014). [r] Sinclair et al. (2007). [s] Foley and Telonis-Scott (2011). [t] Qiu et al. (2012). [u] Thorat et al. (2012).
Conserved Genomic Signatures of Drosophila Desiccation Resistance. doi:10.1093/molbev/msv349. MBE
out of the 382 and 416 genes for the current and 2012 study, respectively (Fisher’s exact test for overlap, $P < 2 \times 10^{-16}$, odds ratio = 5.90; supplementary table S4, Supplementary Material online). Bearing gene length bias in mind, we compared the average length of genes in both studies, as well as in the gene overlap, with the average gene length in the Drosophila genome (supplementary fig. S3, Supplementary Material online). Although there was bias toward longer genes in the overlap list, this was also apparent in the full gene lists from both studies.
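The gene-level overlap test above can be illustrated with a hypergeometric calculation on a 2×2 table. The sketch below is not the authors' analysis: the background of 15,000 genes is a placeholder assumption (the gene universe is not stated in this section), so the printed odds ratio will differ from the reported 5.90, and the one-sided upper tail is used as a simple stand-in for Fisher's exact test.

```python
from math import comb


def overlap_test(n_a, n_b, n_shared, n_background):
    """Odds ratio and one-sided hypergeometric P value for gene list overlap.

    n_a, n_b: sizes of the two gene lists; n_shared: genes in both lists;
    n_background: assumed size of the gene universe (hypothetical here).
    """
    a = n_shared
    b = n_a - n_shared
    c = n_b - n_shared
    d = n_background - n_a - n_b + n_shared
    odds_ratio = (a * d) / (b * c)
    # Probability of drawing at least n_shared genes of list B in list A.
    tail = sum(
        comb(n_b, k) * comb(n_background - n_b, n_a - k)
        for k in range(n_shared, min(n_a, n_b) + 1)
    )
    return odds_ratio, tail / comb(n_background, n_a)


# List sizes from the text; the background size is an assumption.
odds, p = overlap_test(382, 416, 45, 15000)
print(f"odds ratio = {odds:.2f}, P = {p:.1e}")
```

The expected overlap under independence is roughly (382 × 416) / background, so with any plausible background size an overlap of 45 genes lies far in the upper tail.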
The overlap genes mapped to all of the major chromosomes, and their differentiated variants were mostly noncoding SNPs located in introns (supplementary table S4, Supplementary Material online). Two SNPs were synonymous (Strn-Mlck, coding for a protein kinase, and Cht7, chitin binding; supplementary table S4, Supplementary Material online), while a 3′UTR variant was predicted to putatively alter protein structure in a nondisruptive manner, most likely impacting expression and/or protein translational efficiency (jbug, response to stimulus; supplementary table S4, Supplementary Material online). One nonsynonymous SNP was predicted to alter protein function as a missense variant (fab1, signaling; supplementary table S4, Supplementary Material online). The increase in frequency of the favored desiccation alleles compared with the controls ranged from 5% to 27%, with a median increase of 16.4%.
Despite the strong overlap, there were important differences between the two studies. For example, our previous work revealed evidence of a selective sweep in the 5′ promoter region of the Dys gene in the artificially selected lines, and we confirmed a higher frequency of the selected SNPs in an independent natural association study in resistant flies from Coffs Harbour (Telonis-Scott et al. 2012). In this study, we found no differentiation at the Dys locus between our resistant and random controls. However, both resistant and control pools had a high frequency of the selected allele detected in Telonis-Scott et al. (2012) (supplementary table S5, Supplementary Material online).
The 45 genes common to both studies were analyzed for biological function; while many genes are obviously pleiotropic (i.e., assigned multiple GO terms), almost half were involved in response to stimuli (cellular, behavioral, and regulatory), including signaling such as Rho protein signal transduction and cAMP-mediated signaling (table 3). Other categories of interest include spiracle morphogenesis, sensory organ development including development of the compound eye, and stem cell and neuroblast differentiation (table 3).
With regard to specific stress response/desiccation candidates (table 1), 4 of the 45 genes overlapped, including the cAMP/cGMP signaling phosphodiesterases pde9 and dnc, the MAPK pathway gene Mtl, and klu, implicated in NF-κB orthologue signaling in the renal tubules (table 1). Furthermore, there was overlap among mechanistic categories, where different genes in the same pathways were detected. For example, additional key cAMP/cGMP hydrolyzing phosphodiesterases pde1c and pde11 were significant in Telonis-Scott et al. (2012), as well as slpr (MAPK pathway) and poly (insulin signaling) (table 1).
Table 2. Pathway Enrichment of the GWAS and Telonis-Scott et al. First-Order PPI Networks Investigated Using FlyMine (FDR P < 0.05).

Columns: Database; Pathway (Current Study GWAS); Pathway (Telonis-Scott et al. 2012)

KEGG
Ribosome | Ribosome
Notch signaling pathway | Notch signaling pathway
MAPK signaling pathway
Dorso-ventral axis formation
Progesterone-mediated oocyte maturation
Wnt signaling pathway
Endocytosis
Proteasome

Reactome
Immune system | Immune system
Adaptive immune system | Adaptive immune system
Innate immune system | Innate immune system
Signaling by interleukins | Signaling by interleukins
(Toll-like receptor cascades) | Toll-like receptor cascades
TCR signaling
(B cell receptor signaling)
Gene expression | Gene expression
Metabolism of RNA | Metabolism of RNA
Nonsense-mediated decay | Nonsense-mediated decay
Transcription
mRNA processing
(RNA transport)
Metabolism of proteins | (Metabolism of proteins)
Translation | (Translation)
Cell cycle | Cell cycle
Cell cycle checkpoints
Regulation of mitotic cell cycle
Mitotic metaphase and anaphase
G0 and early G1 | G0 and early G1
G1 phase
(G1/S transition)
G2/M transition
(DNA replication/S phase)
DNA repair
Double-strand break repair
Signaling pathways | Signaling pathways
Insulin signaling pathway
Signaling by NGF | Signaling by NGF
Signaling by FGFR
Signaling by NOTCH
Signaling by Wnt
Signaling by EGFR
Development
Axon guidance
Membrane trafficking
Gap junction trafficking and regulation
Programmed cell death
Apoptosis
Hemostasis
Hemostasis

NOTE: KEGG and Reactome pathway databases were queried. All significant terms are reported, with Reactome terms grouped together under parent terms in italic text (full list of terms available in supplementary table S3, Supplementary Material online). Parent terms without brackets are the names of pathways that were significantly enriched; parent terms in parentheses were not listed as significantly enriched themselves but are reported as a guide to the category contents.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. Three hundred seven GWAS seeds formed a network of 3,324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3,215 nodes. One thousand eight hundred twenty-six, or ~55%, of the nodes overlapped between networks, which was significantly higher than the null distribution estimated from 1,000 simulations (mean simulated overlap ± SD: 1,304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25,704 ± 4,024 edges; observed overlap 30,686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online) as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
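The comparison of an observed overlap with a simulated null distribution can be sketched in two ways: as an empirical percentile over the simulated values, and as a quick plausibility check from the reported mean and SD. The second function assumes the null is approximately normal, which is an assumption made here for illustration; the study's percentiles come from the 1,000 simulations themselves. Function names are hypothetical.

```python
from statistics import NormalDist


def empirical_percentile(simulated, observed):
    """Percentage of simulated overlaps that fall below the observed overlap."""
    return 100 * sum(s < observed for s in simulated) / len(simulated)


def normal_percentile(observed, sim_mean, sim_sd):
    """Return (z score, percentile) under a normal approximation of the null."""
    z = (observed - sim_mean) / sim_sd
    return z, 100 * NormalDist().cdf(z)


# Summary values reported in the text (nodes, then edges).
z_nodes, pct_nodes = normal_percentile(1826, 1304, 143)
z_edges, pct_edges = normal_percentile(30686, 25704, 4024)
print(f"nodes: z = {z_nodes:.2f}, percentile = {pct_nodes:.2f}")
print(f"edges: z = {z_edges:.2f}, percentile = {pct_edges:.2f}")
```

Under this approximation the node overlap lies well over three SDs above the simulated mean, whereas the edge overlap lies only a little over one SD above it, in line with the contrast drawn in the text.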
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Although both networks showed enrichment in the immune system pathways, the ribosome, and Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways, the network constructed from the Telonis-Scott et al. (2012) gene list showed extensive involvement of cell cycle pathway proteins as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we
Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.

Term ID | Term (Subterms) | FDR P value [a] | Genes

Cellular/behavioral response to stimulus, defense, and signaling
GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658
GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c
GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658

Development
GO:0048863 | Stem cell differentiation | 0.014 | Abd-B, mam, nkd, klu, nub
GO:0014016 | Neuroblast differentiation | 0.048 |
GO:0045165 | Cell fate commitment | 0.010 | Abd-B, dnc, fz, mam, nkd, ct, klu, klg, nub
GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-B, ct, cv-c
GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30c, Mtl
GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658
GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658

Cellular component organization
GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30c, Mtl, Trim9, nub, cv-c

NOTE: Most genes are assigned to multiple annotations and therefore are shown purely as a guide to putative function.
[a] Bonferroni–Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
7
mapped SNPs associated with desiccation resistance in 500naturally derived genotypes Although Pool-seq has limitedpower to reveal very low frequency alleles to reconstructphase or adequately estimate allele frequencies in smallpools (Lynch et al 2014) the Pool-GWAS approach hasbeen validated by retrieval of known candidate genes in-volved in body pigmentation using similar numbers of ge-notypes as assessed here (Bastide et al 2013)
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD, such as D. melanogaster, somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short and long-range LD and contribute numerous false positive SNP phenotype associations (Hoffmann and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53), as described in the text. The histogram shows the distribution of overlap measures for 1,000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled "previous study"). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; 2B). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile; 2B). (E) Overlap at the level of GO biological process terms (directly compared GO terms by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013), and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research, and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008); nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors, including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′UTR and splice variants at the SNP level, and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster, standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus, but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1,200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9,000 flies in total) were screened by placing groups of females into empty vials covered with gauze, and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies, <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the "late mortality" tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies, n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris, Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with Trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
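The trimming rules described above can be illustrated with a short sketch. This is a simplified stand-in written for illustration, not Trimmomatic's own implementation: the function name `trim_read`, the order in which the steps are applied, and the rule of cutting at the start of the first failing window are all assumptions. Only the thresholds (30, window 10, mean quality 25, minimum length 40) come from the text.

```python
def trim_read(quals, leading=30, trailing=30, window=10, window_q=25, min_len=40):
    """Return (start, end) indices of the retained bases, or None if discarded.

    quals is a list of per-base Phred quality scores for one read.
    """
    start, end = 0, len(quals)
    # Remove leading bases below the quality threshold.
    while start < end and quals[start] < leading:
        start += 1
    # Remove trailing bases below the quality threshold.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # Sliding window: cut the read where the window mean first drops too low.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break
    # Discard reads that are too short after trimming.
    if end - start < min_len:
        return None
    return start, end
```

For example, a read with two poor leading bases, fifty good bases, and three poor trailing bases keeps only the fifty good bases, whereas a 30-base read is discarded outright.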
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009), using the bwa aln command and the following options: no seeding (-l 150), 1% rate of missing alignments given 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
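As a rough illustration, the alignment options listed above can be collected into a single argument list. The sketch below only assembles the command and does not run it; the helper name `bwa_aln_command` and the file names are hypothetical placeholders, and the published pipeline may invoke bwa differently.

```python
def bwa_aln_command(reference_fasta, reads_fastq):
    """Build the argument list for one `bwa aln` call with the stated options."""
    return [
        "bwa", "aln",
        "-l", "150",   # seed length longer than the reads, i.e. no seeding
        "-n", "0.01",  # missing-alignment rate given 2% uniform base error
        "-o", "2",     # maximum number of gap opens per read
        "-e", "12",    # maximum number of gap extensions per read
        "-d", "12",    # disallow long deletions within 12 bp of the 3' end
        reference_fasta,
        reads_fastq,
    ]

# Hypothetical file names, for illustration only.
command = bwa_aln_command("dmel_r5.53.fasta", "pool1_lane1_P1.trimmed.fastq")
print(" ".join(command))
```

Keeping the options in a list rather than a single string makes it straightforward to pass the command to `subprocess.run` without shell quoting issues.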
At this point, the bam files were merged into one file per sample using PicardMerge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
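The replicate comparison can be sketched in a few lines. The code below computes Lin's concordance correlation coefficient for a pair of replicates and averages it over all pairs within a pool category. It is a minimal illustration using moments with an n denominator; the function names are invented here, and the epi.ccc function used in the study may differ in detail (for example in how variances are estimated).

```python
from itertools import combinations
from statistics import fmean


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def mean_pairwise_ccc(replicates):
    """Average the coefficient over all pairs of replicates in one category.

    replicates is a list of allele frequency vectors, one per technical
    replicate, with SNPs in the same order in every vector.
    """
    return fmean(concordance_cc(a, b) for a, b in combinations(replicates, 2))
```

Unlike the Pearson correlation, this coefficient is reduced by a systematic shift between replicates, which is why it suits a check on whether replicate pools give the same allele frequency estimates rather than merely correlated ones.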
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow -norna -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
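The per-SNP test can be sketched as follows. This is an illustrative reimplementation in plain Python, not the Popoolation2 script itself: the function names are invented, and the way the filters are applied (minor allele count summed across both pools, coverage checked per pool) is an assumption about how the stated thresholds are interpreted.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(
        p for p in (table_probability(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, total)


def snp_p_value(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """Return the P value for one SNP, or None if it fails the filters.

    control and resistant are (major_allele_reads, minor_allele_reads).
    """
    for pool in (control, resistant):
        if not min_cov <= sum(pool) <= max_cov:
            return None
    if control[1] + resistant[1] < min_count:
        return None
    return fisher_exact_two_sided(control[0], control[1], resistant[0], resistant[1])
```

A SNP with read counts of (300, 100) in the control pool and (250, 200) in the resistant pool would pass the filters and be tested, whereas a SNP covered by only ten reads in one pool would return None.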
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean a/(a + b) and variance ab/((a + b)^2 (a + b + 1)). This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of a = 37 was chosen by chi-square test as the best match to the observed P value distribution (a = 10 to a = 40 were tested), with a corresponding b = 43.15 to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10 times the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
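Two small calculations from this section can be made concrete. The sketch below evaluates the stated mean and variance for a = 37 and b = 43.15, shows that this b matches a scaled by the ratio of the two coverage depths, and derives an empirical threshold from a list of null P values. The function names are invented for illustration, the simulation of the null P values themselves is not reproduced, and the default cutoff fraction (0.0005, i.e., 0.05%) is an assumption that should be adjusted if a different fraction is intended.

```python
def beta_moments(a, b):
    """Mean and variance of the beta component with shape parameters a and b."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


def empirical_threshold(null_p_values, fraction=0.0005):
    """Largest P value among the lowest `fraction` of the null distribution."""
    ordered = sorted(null_p_values)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1]


a = 37
b = a * 568 / 487  # scale a by the ratio of pool coverage depths
mean, variance = beta_moments(a, 43.15)
print(round(b, 2), round(mean, 4), round(variance, 5))
```

Observed P values falling below the value returned by `empirical_threshold` would then be treated as candidates, which is the logic of an empirical cutoff as opposed to a standard FDR correction.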
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ2 tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria, where SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length, as well as overlapping gene biases (Wang et al. 2011), were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10^6 simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05, background = all D. melanogaster genes in our full PPI network, and Ontology (for GO enrichment only) = biological process or molecular function.
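The first-order network construction can be sketched without any graph library. The study used the igraph R package; the Python function below is an illustrative stand-in with an invented name, and it assumes that the subgraph is the induced subgraph on the seed proteins plus their direct interactors (so interactions between two neighbors are kept), which the text does not state explicitly. The protein labels in the example are placeholders.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) for seed proteins and their direct interactors.

    edges is an iterable of (protein_a, protein_b) pairs from the full PPI
    network; seeds is an iterable of candidate protein identifiers.
    """
    # Drop self-connections and treat interactions as undirected.
    undirected = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    in_network = {protein for edge in undirected for protein in edge}
    seed_set = set(seeds) & in_network

    # Seeds plus every protein that shares an edge with a seed.
    nodes = set(seed_set)
    for edge in undirected:
        if edge & seed_set:
            nodes |= edge

    # Induced subgraph: keep every edge whose two ends were both retained.
    sub_edges = {edge for edge in undirected if edge <= nodes}
    return nodes, sub_edges


# Toy example with placeholder protein names.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("B", "B")]
nodes, sub_edges = first_order_network(toy_edges, ["B", "not_in_network"])
```

In the toy example, seeding with B retains A, B, and C together with the three interactions among them, while D (two steps from the seed) and the self-connection on B are excluded.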
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study with alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study, and took the form of a 2 × 2 table classifying genes as present or absent in each gene list, where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; and N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
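The overlap test can be illustrated with the numbers given in the text. The sketch below fills in the four cells of the 2 × 2 table from the two list sizes, the 45 shared genes, and the genome-wide gene count, then computes a one-sided enrichment P value from the hypergeometric distribution. The function names are invented, and treating the test as one-sided is an assumption, since the sidedness is not stated.

```python
from math import comb


def overlap_table(n_a, n_b, n_both, n_total):
    """Cells of the 2 x 2 table as [[neither, B only], [A only, both]]."""
    return [
        [n_total - n_a - n_b + n_both, n_b - n_both],
        [n_a - n_both, n_both],
    ]


def enrichment_p_value(n_a, n_b, n_both, n_total):
    """P(overlap >= n_both) when n_b genes are drawn at random from n_total."""
    favorable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, min(n_a, n_b) + 1)
    )
    return favorable / comb(n_total, n_b)


# A = 416 genes (Telonis-Scott et al. 2012), B = 382 genes (this study),
# 45 genes shared, N = 17,106 annotated genes.
table = overlap_table(416, 382, 45, 17106)
p_value = enrichment_p_value(416, 382, 45, 17106)
print(table, p_value)
```

Under random sampling, the expected overlap would be about 416 × 382 / 17,106 ≈ 9 genes, so an observed overlap of 45 genes lies far in the upper tail of the null distribution.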
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Due to this limitation, we did not compare exact genomic coordinates between the studies, but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction, P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
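The resampling procedure can be sketched as follows. This is a minimal illustration in plain Python rather than the igraph workflow used in the study: all function names are invented, the first-order network is assumed to be the induced subgraph on seeds plus direct interactors, and the observed value is located in the null distribution by an empirical percentile rather than a fitted normal curve.

```python
import random


def first_order_edges(adjacency, seeds):
    """Edges among seed proteins and their direct interactors."""
    present = {s for s in seeds if s in adjacency}
    nodes = set(present)
    for s in present:
        nodes |= adjacency[s]
    return {
        frozenset((u, v))
        for u in nodes for v in adjacency.get(u, ())
        if v in nodes and u != v
    }


def overlap_counts(adjacency, genes_a, genes_b):
    """(node count, edge count) of the intersection of two first-order networks."""
    shared = first_order_edges(adjacency, genes_a) & first_order_edges(adjacency, genes_b)
    # Counting only proteins on shared edges drops zero-degree nodes.
    nodes = {protein for edge in shared for protein in edge}
    return len(nodes), len(shared)


def simulate_null(adjacency, universe, size_a, size_b, n_iter=1000, seed=0):
    """Overlap counts for random gene sets of the same sizes as the real lists."""
    rng = random.Random(seed)
    return [
        overlap_counts(adjacency, rng.sample(universe, size_a), rng.sample(universe, size_b))
        for _ in range(n_iter)
    ]


def empirical_percentile(observed, null_values):
    """Percentage of simulated values less than or equal to the observed one."""
    return 100 * sum(v <= observed for v in null_values) / len(null_values)
```

Here `adjacency` maps each protein to the set of its interactors and `universe` is the list of all annotated genes, including those absent from the PPI network, so that random gene sets can contribute fewer seeds than their nominal size, as can happen with the real lists.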
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
                                     | Not in GWAS       | In GWAS
Not in Telonis-Scott et al. (2012)   | N – intersect-A–B | A-intersect
In Telonis-Scott et al. (2012)       | B-intersect       | Intersect(A,B)
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal, Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool. 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgrò CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgrò CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra: flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102: advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgrò CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
signaling in ion and water balance regulated by excretion via the Malpighian tubules (MTs) and gut absorption has been shown to impact desiccation resistance in D. melanogaster (Kahsai et al. 2010; Terhzaz et al. 2012, 2014, 2015). Specifically, knockdown of the Capability (Capa) neuropeptide gene enhanced desiccation resistance by reductions in respiratory, cuticular, and excretory water loss, while also cross-conferring cold tolerance (Terhzaz et al. 2015). Cuticular hydrocarbon (CHC) levels are also implicated: a single gene encoding a fatty acid synthase, mFAS, was shown to impact both desiccation resistance and mate choice in the Australian rainforest endemics Drosophila serrata and Drosophila birchii (Chung et al. 2014). In D. melanogaster, knockdown of CYP4G1 also reduced CHC production and impaired survival under low humidity (Qiu et al. 2012). Metabolic signaling has also been shown to impact desiccation resistance and includes components of the insulin signaling pathways (Söderberg et al. 2011; Liu et al. 2015). Several other single-gene studies highlight different mechanisms, such as desiccation avoidance in larvae via nociceptors encoded by members of the transient receptor potential (TRP) aquaporin family (Johnson and Carder 2012), cyclic adenosine monophosphate (cAMP)-dependent signaling protein kinase desi (Kawano et al. 2010), and potential tissue protection via trehalose sugar accumulation (Thorat et al. 2012).
Collectively, these functional approaches highlight the complexity of insect water balance strategies in controlled backgrounds, but do not explain the variation observed in natural populations and species. Resistance evolves rapidly and is highly heritable (up to 60%) in the cosmopolitan and unusually resistant species D. melanogaster, whereas heritability is much lower in less tolerant, range-restricted species (Hoffmann and Parsons 1989b; Kellermann et al. 2009). Multiple genome-wide quantitative trait loci were identified from mapping lines constructed from a natural D. melanogaster population, suggesting a polygenic architecture (Foley and Telonis-Scott 2011). Transcriptomics revealed differential regulation of thousands of genes in response to desiccation in Drosophila mojavensis (Matzkin and Markow 2009), while artificial selection for desiccation resistance altered the basal expression of over 200 genes in D. melanogaster (Sørensen et al. 2007).
Previously, we used microarray-based genomic hybridization to survey allele frequency shifts in experimental evolution lines of D. melanogaster recently derived from the field (Telonis-Scott et al. 2012). We documented shifts in over 600 loci in response to selection for desiccation resistance, following a rapid phenotypic response after only 8 generations of selection. This variant identification approach was limited to highlighting candidate genes and regions, and not actual nucleotide polymorphisms apart from those sequenced post hoc. We now expand on this to further explore the unresolved natural genomic complexity of desiccation resistance using a high-throughput sequencing Pool-GWAS (genome-wide association study) approach (Bastide et al. 2013). In a GWAS framework for polygenic traits, where many small-effect genes contribute to a phenotype, a large sample size is required for adequate power (Mackay et al. 2009). Accordingly, we sampled natural genetic variation from almost 1,000 inseminated wild-caught females and compared the upper 5% desiccation-resistant tail from almost 10,000 F1 progeny with a control sample chosen randomly from the same progeny set. Harnessing the power of Pool-GWAS, we sequenced a pool of 500 natural “resistant” genomes and compared allele frequencies with a pool of 500 “random control” genomes.
We employed a novel comparative approach with the resulting set of candidate loci and those detected in our previous artificial selection study of flies collected several seasons earlier from the same region of southeastern Australia (Telonis-Scott et al. 2012). Cross-study repeatability of loci associated with complex phenotypes is often low (Sarup et al. 2011) and can be influenced by genetic background, selection intensity, epistatic effects, and gene-by-environment interactions (Mackay et al. 2009; Civelek and Lusis 2014). Multiple genetic solutions appear to underlie the array of desiccation responses, and we further explored this by investigating overlap at the level of individual genes, functional gene ontology (GO) categories, and protein–protein interaction (PPI) networks. Genetic variants contribute to final phenotypes by way of “intermediate phenotypes” (transcript, protein, and metabolite abundances), and correlations should theoretically occur across these biological scales (Civelek and Lusis 2014).
Here, we report on a screen of natural nucleotide variants associated with desiccation resistance and, using a powerful analysis approach, demonstrate common cross-study signatures across different hierarchical levels, from gene to network and function. We found evidence that some functional desiccation candidates may be important in wild populations, and discovered variants linked to multiple physiological responses, consistent with the trait's complex underpinnings at both the physiological and molecular levels.
Results
Genome Wide Differentiation for Natural Desiccation Resistance
We collected over 1,000 D. melanogaster isofemale lines from southern Australia and, after one generation of laboratory culture, screened over 9,000 female progeny for desiccation resistance. The top ~5% desiccation-resistant flies were selected for Illumina sequencing (~500 flies), together with a random sampling of the same number of flies from the families as the “control” pool. Our design incorporated a subpooling (technical replicate) strategy to both optimize Pool-seq allele frequency estimates and control for technical bias on allele frequency estimates (see Materials and Methods). For the control and desiccation-resistant samples, respectively, 162M–259M and 188M–289M raw reads were obtained per technical replicate, and an average of 95.0% of reads mapped to the reference genome after trimming (mean of all 10 replicates; standard deviation [SD] 0.5%).
Variance in allele frequency among the five technical replicates for each pool was low (supplementary fig. S1, Supplementary Material online; mean across control replicates 0.0028, median 0.0015; mean across resistant replicates 0.0027, median 0.0016), corresponding to a mean SD of ~4.5% around the mean allele frequency. The mean pairwise difference in allele frequency estimates was 5.5% and 5.4% for control and desiccation-resistant replicates, respectively, consistent with previously published estimates (4–6%; Kofler et al. 2016).
For a test subset of 1,803 randomly sampled single nucleotide polymorphisms (SNPs) that showed no significant differentiation between control and desiccation-resistant pools, the replicates accounted for the majority of total variance in allele frequency. Replicates within pool category (control vs. desiccation resistant; supplementary fig. S1, Supplementary Material online) explained between 15% and 100% (mean 87%, median 93%) of what was a relatively low total variance (mean 0.0029, median 0.0019). The mean concordance correlation coefficient was 0.98 for each pool category, higher than a comparable Drosophila study that reported a correlation of 0.898 between 2 replicates with 20 individuals (Zhu et al. 2012). We therefore combined the technical replicates into a single control and resistant pool, respectively, for further analyses. After processing, the mean coverage depth across the genome was 487× and 568× per sample, respectively (equivalent to 0.97× and 1.1× per individual in the pools of 500 flies).
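To make the replicate agreement measure concrete, the following Python sketch computes Lin's concordance correlation coefficient for two vectors of allele frequency estimates. It is an illustration only: the source does not state how the coefficient was calculated beyond citing it, so the use of population (1/n) moments is an assumption, and the function name and example frequencies are hypothetical.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumption: population (1/n) variances and covariance are used.
    Raises ValueError for mismatched or too-short inputs.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Unlike Pearson's r, the denominator penalizes a shift in means.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical allele frequencies at five SNPs in two technical replicates.
replicate_1 = [0.10, 0.25, 0.40, 0.55, 0.90]
replicate_2 = [0.12, 0.22, 0.43, 0.50, 0.88]
ccc = concordance_ccc(replicate_1, replicate_2)
```

The coefficient equals 1 only when the two replicates agree exactly, so it is a stricter measure of replicate consistency than a simple correlation, which ignores systematic offsets between replicates.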
Allele Frequency Changes
The allele frequencies of 3,528,158 SNPs were compared between the 2 pools using Fisher's exact tests with a permutation-based false discovery rate (FDR) correction, chosen based on the top 0.05% of the simulated null P values. Six hundred forty-eight SNPs were differentiated between desiccation-resistant and control pools (fig. 1 and supplementary table S1, Supplementary Material online). The SNPs were nonrandomly distributed across chromosomes (χ²₄ = 43.12, P < 0.0001) and were highly abundant on the X chromosome (χ²₁ = 36.98, P < 0.0001), while SNPs were underrepresented on both 2L and 3L (2L: χ²₁ = 6.41, P < 0.01; 3L: χ²₁ = 3.97, P < 0.05). In the desiccation pool, the allelic distribution of the 648 SNPs ranged from low to fixed (4–100%; fig. 1B and supplementary table S1, Supplementary Material online). The increase in frequency of the favored “desiccation” alleles compared with the controls ranged from 3% to 41% (median increase 14%; supplementary table S1, Supplementary Material online), where the largest shifts tended to occur in common intermediate frequencies (fig. 1B).

Fig. 1. (A) Manhattan plot of genome-wide P values for differentiation between desiccation-resistant and random control pools, shown for the entire genome, where each point represents one SNP. SNPs highlighted in red showed differentiation above the 0.05% threshold level determined from the null P value distribution (see text for details). SNPs highlighted in blue were above the 0.05% threshold and also detected in Telonis-Scott et al. (2012). (B) Standing allele frequency in the control pool plotted against the frequency change in the selected pool for the 648 differentiated SNPs (FDR 0.05%).
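The per-SNP comparison of allele counts between the two pools can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the software used in the study. The two-sided P value sums the probabilities of all tables no more likely than the observed one, which is one common convention. The threshold helper simply takes a chosen fraction of the smallest null P values; how the null P values were simulated is not described here, so that step is left to the caller. All function names and read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: pools (e.g. resistant, control); columns: allele read counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x reference reads in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(p for p in (prob(x) for x in range(low, high + 1))
                  if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def null_threshold(null_p_values, fraction):
    """P value cutoff at the given fraction of the smallest null P values."""
    ordered = sorted(null_p_values)
    k = max(1, int(len(ordered) * fraction))
    return ordered[k - 1]


# Hypothetical SNP: 60 of 100 reads carry the allele in the resistant pool,
# 40 of 100 reads in the control pool.
p_snp = fisher_exact_two_sided(60, 40, 40, 60)
```

A SNP would then be called differentiated when its P value falls at or below the cutoff returned by the threshold helper for the chosen fraction of the null distribution.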
Inversion Frequencies
To test whether inversions were associated with desiccation resistance, allele frequencies were checked at SNPs diagnostic for the following inversions: In(3R)P (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)NS, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). In(3L)P, In(3R)K, and In(3R)Mo were not detected in this population, and for all other inversions investigated, the control and desiccation-resistant pool frequencies did not differ by more than 6%.
Genomic Distribution
We examined these candidates using a comprehensive approach encompassing SNP, gene, functional ontology, and network levels. First, we examined SNP locations with respect to gene structure and putative effects of SNPs on genome features. The 648 SNPs mapped to 382 genes and were largely noncoding (79%; supplementary table S2, Supplementary Material online). Coding variants comprised nearly 15% of candidate SNPs, largely synonymous substitutions, and the remainder nonsynonymous missense variants predicted to result in amino acid substitution with length preservation (9 and 14; supplementary table S2, Supplementary Material online). Based on proportions of features from all annotated SNPs compared with the candidates, intron variants were significantly underrepresented (χ²₁ = 7.06, P < 0.01; supplementary table S2, Supplementary Material online), while SNPs were enriched in 5′UTRs, 5′UTR premature start codon sites, and splice regions (χ²₁ = 8.20, P < 0.01; χ²₁ = 31.24, P < 0.0001; χ²₁ = 6.40, P < 0.05, respectively; supplementary table S2, Supplementary Material online).
GO Functional Enrichment
For the GO analyses, we found no overrepresented terms using the stringent Gowinda gene mode, but identified several categories using SNP mode with a 0.05 FDR threshold. GO terms were enriched for the broad categories of “chromatin,” “chromosome,” and, more specifically, “double-stranded RNA binding,” although significance of this category was due solely to nine differentiated SNPs in the DIP1 gene. The term “gamma-secretase complex” was also enriched due to three SNPs in the pen-2 gene.
Desiccation Candidate Genes
Chown et al. (2011) comprehensively summarized the primary mechanisms underpinning desiccation resistance, from the critical first phase of stress sensing and behavioral avoidance to the manifold physiological trajectories available to counter water stress. These include water content variation, water loss rates (desiccation resistance), and tolerance of water loss (dehydration tolerance). We expanded on this review to incorporate recent molecular research and summarized the primary mechanisms and known candidate genes relevant to D. melanogaster stress sensing and water balance (table 1 and references therein). Informed by this body of work, we applied a “top-down” mechanistic approach to screening desiccation candidate genes in our data (e.g., from initial hygrosensing to downstream key physiological pathways; table 1). We did not observe differentiation in SNPs mapping to the TRP ion channels, the key hygrosensing receptor genes. However, we did map SNPs to genes involved in stress sensing specific to the MT fluid secretion signaling pathways, including two of the six cAMP and cyclic guanosine monophosphate (cGMP) signaling phosphodiesterases, dnc and pde9, as well as klu, implicated in NF-kB ortholog signaling in the renal tubules (table 1). A number of the stress-sensing SAPK pathway genes were differentiated (table 1). Key genes involved in the insulin signaling pathway, implicated in metabolic homeostasis and water balance, were also differentiated, including the insulin receptor InR and insulin receptor binding Dok (table 1).
Network Analysis: Modeling Genomic Differentiation for Desiccation in a PPI Context
Given the genomic complexity underpinning desiccation resistance, we also implemented a higher-order analysis to provide an overview of potential architectures beyond the SNP and gene levels using PPI networks. The SNPs mapped to 382 genes, which were annotated to 307 seed proteins occurring in the full background D. melanogaster PPI network. The seed proteins produced a large first-order network with 3,324 nodes and 51,299 edges.
To determine if broader functional signatures could be ascertained from the complex network, we employed functional enrichment analyses on the 3,324 nodes (table 2 and supplementary table S3, Supplementary Material online). Strikingly, the network was enriched for pathways involved in gene expression, from transcription through RNA processing to RNA transport, metabolism, and decay (table 2). The network analysis also revealed further complexity of the stress response, with enrichment of proteins involved in regulating innate insect immunity (table 2), where multiple pathways were overrepresented, including cytokine, interleukin, and toll-like receptor signaling cascades (table 2). Developmental/gene regulation signaling pathways were significant in the pathway analysis: the KEGG database showed enrichment for Notch and mitogen-activated protein kinase (MAPK) signaling, while insulin and NGF pathways were highlighted from the Reactome database (table 2).
Common Signatures of Desiccation Resistance across Studies
The key finding of this study is the degree of genomic overlap with our previous mapping of alleles associated with artificially selected desiccation phenotypes (Telonis-Scott et al. 2012). Although the flies were sampled from comparable southeastern Australian locations, the study designs were different. Specifically, the earlier study focused on resistance generated by artificial selection, whereas this study focused on natural variation in resistance. Despite the differences in study design, we identified 45 genes common to both studies
Table 1. Summary of Primary Molecular Responses and Candidate Genes for Desiccation Response Mechanisms Identified in the Literature.

| Mechanism | Molecular Function | Genes | Current Study (F1 wild-caught) | Telonis-Scott et al. (2012) (F8 artificial selection) |
|---|---|---|---|---|
| Hygrosensing: Sensory moisture receptors (antennae, legs, gut, mouthparts) | | | | |
| Temperature-activated transient receptor potential ion channels [b,c,d] | TRP ion channels | trpl, trp, Trpγ, TrpA1, Trpm, Trpml, pain, pyx, wtrw, nompC, iav, nan, Amo | - | trpl |
| Stress sensing: Malpighian tubules (principal cells); tubule fluid secretion signaling pathways | | | | |
| cAMP signaling [e,f,g] | cAMP activation | Dh44-R2 | - | - |
| cGMP signaling [e,f,g] | cGMP activation | Capa, CapaR, Nplp1-4 | - | - |
| | Receptors | Gyc76c, CG33958, CG34357 | - | CG34357 |
| | Kinases | dg1-2 | - | - |
| cAMP + cGMP signaling [e,f,g,h] | Hydrolyzing phosphodiesterases | dnc, pde1c, pde6, pde9, pde11 | dnc, pde9 | dnc, pde1c, pde9, pde11 |
| Calcium signaling [e,g] | Calcium-sensitive nitric oxide synthase | Nos | - | Nos |
| Desiccation specific | Diuretic neuropeptide NF-κB signaling [i,j,k] | Capa, CapaR, trpl, Relish, klu | klu | klu, trpl |
| | Anti-diuretic neuropeptide [l] | ITP, sNPF, Tk | - | ITP, sNPF |
| Stress sensing: Stress responsive pathway | | | | |
| Stress activated protein kinase pathway [m,n] | MAPK/Jun kinase | Aop, Aplip1, Atf-2, bsk, Btk29A, cbt, Cdc42, Cka, cno, CYLD, Dok, egr, Gadd45, hep, hppy, Jra, kay, medo, Mkk4, msn, Mtl, Pak, puc, pyd, Rac1-2, Rho1, shark, slpr, Src42A, Src64B, Tak1, Traf6 | Aop, Dok, hep, kay, msn, Mtl, puc, Rac1, Src64B | Mtl, slpr |
| Metabolic homeostasis and water balance | | | | |
| Components of insulin signaling pathway | Receptor [o,p] | InR | InR | - |
| | Receptor binding | chico, Ilp1-8, dock, Dok, poly | Dok | Poly |
| Resistance mechanisms: Water loss barriers | | | | |
| Cuticular hydrocarbons | Fatty acid synthases [q], desaturases [r], transporters, oxidative decarbonylase [t] | CG3523, CG3524, desat1-2, FatP, Cyp4g1 | CG3523 | - |
| Primary hemolymph sugar/tissue-protectant | | | | |
| Trehalose metabolism | Trehalose-phosphatase [u], trehalase activity [u], phosphoglycerate mutase | Tps1, Treh, Pgm, CG5171, CG5177, CG6262, crc | Treh, Pgm | - |

NOTE: Candidate genes differentiated in the current study and in Telonis-Scott et al. (2012) are shown and are underlined where overlapping.
[a] For brevity, several comprehensive reviews are cited and readers are referred to references therein. Note that water budgeting via discontinuous gas exchange was not included due to a lack of "bona fide" molecular candidates, and desiccation tolerance due to aquaporins was omitted as genes were not differentiated in either of our studies.
[b] Chown et al. (2011). [c] Johnson and Carder (2012). [d] Fowler and Montell (2013). [e] Day et al. (2005). [f] Davies et al. (2012). [g] Davies et al. (2013). [h] Davies et al. (2014). [i] Terhzaz et al. (2012). [j] Terhzaz et al. (2014). [k] Terhzaz et al. (2015). [l] Kahsai et al. (2010). [m] Huang and Tunnacliffe (2004). [n] Huang and Tunnacliffe (2005). [o] Söderberg et al. (2011). [p] Liu et al. (2015). [q] Chung et al. (2014). [r] Sinclair et al. (2007). [r] Foley and Telonis-Scott (2011). [t] Qiu et al. (2012). [u] Thorat et al. (2012).
Conserved Genomic Signatures of Drosophila Desiccation Resistance. doi:10.1093/molbev/msv349. MBE
out of the 382 and 416 genes for the current and 2012 studies, respectively (Fisher's exact test for overlap, P < 2 × 10⁻¹⁶, odds ratio = 5.90; supplementary table S4, Supplementary Material online). Bearing gene length bias in mind, we compared the average length of genes in both studies, as well as in the gene overlap, with the average gene length in the Drosophila genome (supplementary fig. S3, Supplementary Material online). Although there was a bias toward longer genes in the overlap list, this was also apparent in the full gene lists from both studies.
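The overlap test reported above can be illustrated with a minimal sketch that uses only the gene counts given in the text (382, 416, and 45). The size of the background gene set is not stated in this section, so `ASSUMED_BACKGROUND_GENES` is a placeholder assumption; the function name is likewise hypothetical, and the output will not reproduce the published odds ratio or P value exactly. The odds ratio computed here is the simple sample odds ratio of the 2 × 2 table, which can differ slightly from the conditional estimate returned by some statistical packages.

```python
from math import comb


def overlap_enrichment(n_list_a, n_list_b, n_shared, n_background):
    """Return (odds ratio, one-sided P value) for the overlap of two gene lists."""
    # Cells of the 2x2 table: shared, unique to each list, and in neither list.
    only_a = n_list_a - n_shared
    only_b = n_list_b - n_shared
    neither = n_background - n_list_a - n_list_b + n_shared
    odds_ratio = (n_shared * neither) / (only_a * only_b)

    # Hypergeometric upper tail: probability of at least n_shared shared genes
    # when list B is drawn at random from the background.
    tail = sum(
        comb(n_list_a, k) * comb(n_background - n_list_a, n_list_b - k)
        for k in range(n_shared, min(n_list_a, n_list_b) + 1)
    )
    p_value = tail / comb(n_background, n_list_b)
    return odds_ratio, p_value


# ASSUMPTION: placeholder background size, not the value used in the study.
ASSUMED_BACKGROUND_GENES = 15000
odds_ratio, p_value = overlap_enrichment(382, 416, 45, ASSUMED_BACKGROUND_GENES)
print(f"odds ratio = {odds_ratio:.2f}, one-sided P = {p_value:.2e}")
```

Because the background size enters both the "neither" cell and the tail probability, a larger assumed background raises the odds ratio and lowers the P value.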
The overlap genes mapped to all of the major chromosomes, and their differentiated variants were mostly noncoding SNPs located in introns (supplementary table S4, Supplementary Material online). Two SNPs were synonymous (Strn-Mlck, coding for a protein kinase, and Cht7, chitin binding; supplementary table S4, Supplementary Material online), while a 3′ UTR variant was predicted to alter protein structure in a nondisruptive manner, most likely impacting expression and/or protein translational efficiency (jbug, response to stimulus; supplementary table S4, Supplementary Material online). One nonsynonymous SNP was predicted to alter protein function as a missense variant (fab1, signaling; supplementary table S4, Supplementary Material online). The increase in frequency of the favored desiccation alleles compared with the controls ranged from 5% to 27%, with a median increase of 16.4%.
Despite the strong overlap, there were important differences between the two studies. For example, our previous work revealed evidence of a selective sweep in the 5′ promoter region of the Dys gene in the artificially selected lines, and we confirmed a higher frequency of the selected SNPs in an independent natural association study in resistant flies from Coffs Harbour (Telonis-Scott et al. 2012). In this study, we found no differentiation at the Dys locus between our resistant and random control pools. However, both resistant and control pools had a high frequency of the selected allele detected in Telonis-Scott et al. (2012) (supplementary table S5, Supplementary Material online).
The 45 genes common to both studies were analyzed for biological function. While many genes are obviously pleiotropic (i.e., assigned multiple GO terms), almost half were involved in response to stimuli (cellular, behavioral, and regulatory), including signaling such as Rho protein signal transduction and cAMP-mediated signaling (table 3). Other categories of interest include spiracle morphogenesis, sensory organ development including development of the compound eye, and stem cell and neuroblast differentiation (table 3).
With regard to specific stress response/desiccation candidates (table 1), 4 of the 45 genes overlapped: the cAMP/cGMP signaling phosphodiesterases pde9 and dnc, the MAPK pathway gene Mtl, and klu, which is implicated in NF-κB orthologue signaling in the renal tubules (table 1). Furthermore, there was overlap among mechanistic categories, where different genes in the same pathways were detected. For example, the additional key cAMP/cGMP hydrolyzing phosphodiesterases pde1c and pde11 were significant in Telonis-Scott et al. (2012), as were slpr (MAPK pathway) and poly (insulin signaling) (table 1).
Table 2. Pathway Enrichment of the GWAS and Telonis-Scott et al. (2012) First-Order PPI Networks Investigated Using FlyMine (FDR P < 0.05).
Database; Pathway (Current Study GWAS); Pathway (Telonis-Scott et al. 2012)
KEGG: Ribosome; Ribosome; Notch signaling pathway; Notch signaling pathway; MAPK signaling pathway; Dorso-ventral axis formation; Progesterone-mediated oocyte maturation; Wnt signaling pathway; Endocytosis; Proteasome
Reactome: Immune system; Immune system; Adaptive immune system; Adaptive immune system; Innate immune system; Innate immune system; Signaling by interleukins; Signaling by interleukins; (Toll-like receptor cascades); Toll-like receptor cascades; TCR signaling; (B cell receptor signaling)
Gene expression; Gene expression; Metabolism of RNA; Metabolism of RNA; Nonsense-mediated decay; Nonsense-mediated decay; Transcription; mRNA processing; (RNA transport); Metabolism of proteins; (Metabolism of proteins); Translation; (Translation)
Cell cycle; Cell cycle; Cell cycle checkpoints; Regulation of mitotic cell cycle; Mitotic metaphase and anaphase; G0 and early G1; G0 and early G1; G1 phase; (G1/S transition); G2/M transition; (DNA replication/S phase)
DNA repair; Double-strand break repair
Signaling pathways; Signaling pathways; Insulin signaling pathway; Signaling by NGF; Signaling by NGF; Signaling by FGFR; Signaling by NOTCH; Signaling by Wnt; Signaling by EGFR
Development; Axon guidance
Membrane trafficking; Gap junction trafficking and regulation
Programmed cell death; Apoptosis
Hemostasis; Hemostasis
NOTE: KEGG and Reactome pathway databases were queried. All significant terms are reported, with Reactome terms grouped together under parent terms in italic text (full list of terms available in supplementary table S3, Supplementary Material online). Parent terms without brackets are the names of pathways that were significantly enriched; parent terms in parentheses were not listed as significantly enriched themselves but are reported as a guide to the category contents.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. Three hundred seven GWAS seeds formed a network of 3,324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3,215 nodes. One thousand eight hundred twenty-six, or ~55%, of the nodes overlapped between networks, which was significantly higher than the null distribution estimated from 1,000 simulations (mean simulated overlap ± SD: 1,304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25,704 ± 4,024 edges; observed overlap 30,686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online) as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
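The percentile statements above can be checked with a short sketch that places each observed overlap on a normal distribution defined by the simulated mean and SD, as the fig. 2 legend describes. The function name is hypothetical, and the normal approximation is taken from the legend rather than from the simulation code itself, so this is an illustration of the arithmetic only.

```python
from statistics import NormalDist


def overlap_percentile(observed, simulated_mean, simulated_sd):
    """Percentile of an observed overlap under a normal fit to the simulations."""
    null = NormalDist(mu=simulated_mean, sigma=simulated_sd)
    return 100.0 * null.cdf(observed)


# Values as reported in the text.
node_percentile = overlap_percentile(1826, 1304, 143)
edge_percentile = overlap_percentile(30686, 25704, 4024)
shared_node_fraction = 1826 / 3324  # shared nodes relative to the GWAS network

print(f"nodes: {node_percentile:.2f}th percentile")
print(f"edges: {edge_percentile:.2f}th percentile")
print(f"shared fraction of GWAS nodes: {shared_node_fraction:.3f}")
```

The node overlap lies more than three standard deviations above the simulated mean, whereas the edge overlap lies a little over one, which is why only the former is reported as significant.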
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Although both networks showed enrichment in the immune system pathways, the ribosome, and Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways, the network constructed from the Telonis-Scott et al. (2012) gene list showed extensive involvement of cell cycle pathway proteins as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we
Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.

| Term ID | Term (Subterms) | FDR P value [a] | Genes |
|---|---|---|---|
| Cellular/behavioral response to stimulus, defense, and signaling | | | |
| GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c |
| GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| Development | | | |
| GO:0048863 | Stem cell differentiation | 0.014 | Abd-D, mam, nkd, klu, nub |
| GO:0014016 | Neuroblast differentiation | 0.048 | |
| GO:0045165 | Cell fate commitment | 0.010 | Abd-D, dnc, fz, mam, nkd, ct, klu, klg, nub |
| GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-D, ct, cv-c |
| GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30c, Mtl |
| GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658 |
| GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658 |
| Cellular component organization | | | |
| GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30c, Mtl, Trim9, nub, cv-c |

NOTE: Most genes are assigned to multiple annotations and therefore are shown purely as a guide to putative function.
[a] Benjamini-Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or to adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD, such as D. melanogaster, somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short- and long-range LD and contribute numerous false positive SNP phenotype associations (Hoffmann
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53) as described in the text. The histogram shows the distribution of overlap measures for 1,000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled "previous study"). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile, panel A). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile, panel B). (E) Overlap at the level of GO biological process terms (GO terms were compared directly by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013) and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008); nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing and rapid gene accessibility and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′ UTR and splice variants at the SNP level and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster populations standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1,200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9,000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines to 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals (<6% of the total) were selected for the desiccation-resistant pool. The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the "late mortality" tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total of 10 pools of 100 flies, n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris, Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), a 1% rate of missing alignments given a 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
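The alignment settings listed above can be collected into a single argument list, sketched here for illustration. Only the option flags and values are taken from the text; the function name and the file names are hypothetical, and the command is built but not executed. The published pipeline in the repository cited above remains the authoritative version.

```python
def bwa_aln_arguments(reference_prefix, reads_fastq):
    """Build the argument list for one `bwa aln` call with the options in the text."""
    options = {
        "-l": "150",   # seed length longer than the reads, i.e., no seeding
        "-n": "0.01",  # 1% missing alignments given a 2% uniform base error rate
        "-o": "2",     # at most 2 gap opens per read
        "-e": "12",    # at most 12 gap extensions per read
        "-d": "12",    # disallow long deletions within 12 bp of the 3' end
    }
    arguments = ["bwa", "aln"]
    for flag, value in options.items():
        arguments.extend([flag, value])
    # bwa aln writes the alignment index to standard output; the caller redirects it.
    arguments.extend([reference_prefix, reads_fastq])
    return arguments


# Hypothetical file names, for illustration only.
command = bwa_aln_arguments("dmel_r5.53.fasta", "pool1_lane1_P1.trimmed.fastq")
print(" ".join(command))
```

Keeping the flags in one dictionary makes it easy to confirm that both read files of a pair are aligned with identical settings before the paired-end step.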
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
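As an illustration of the replicate comparison, the sketch below computes Lin's concordance correlation coefficient for every pair of replicates and averages the result. It uses the textbook formula with population (1/n) moments and is not a reimplementation of the epiR function, which may apply a different variance convention; the function names and the allele frequencies in the example are hypothetical.

```python
from itertools import combinations
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient, using population (1/n) moments."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def mean_pairwise_concordance(replicates):
    """Average the coefficient over all pairs of technical replicates."""
    pairs = list(combinations(replicates, 2))
    return fmean(concordance_correlation(a, b) for a, b in pairs)


# Hypothetical allele frequency estimates at four SNPs in three replicates.
example_replicates = [
    [0.10, 0.25, 0.40, 0.55],
    [0.12, 0.24, 0.41, 0.53],
    [0.09, 0.27, 0.38, 0.56],
]
print(f"mean pairwise CCC = {mean_pairwise_concordance(example_replicates):.3f}")
```

Unlike a Pearson correlation, the coefficient penalizes a systematic shift between replicates as well as scatter, which is the property of interest when checking that pools can be merged.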
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow -norna -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
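The per-SNP test can be sketched as a coverage filter followed by a two-sided Fisher's exact test on the 2 × 2 table of allele read counts in the two pools. This is a simplified reading of the filters described above, not the Popoolation2 implementation: the function names are hypothetical, the minor allele count is assumed to be summed over both pools, and the read counts in the example are invented.

```python
from math import comb


def passes_filters(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """control and resistant are (major, minor) read counts for one SNP."""
    minor_total = control[1] + resistant[1]  # ASSUMPTION: summed over both pools
    coverages = (sum(control), sum(resistant))
    return minor_total >= min_count and all(min_cov <= c <= max_cov for c in coverages)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(k):
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    observed = table_probability(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    total = sum(
        p for p in (table_probability(k) for k in range(low, high + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical read counts: (major, minor) in the control and resistant pools.
control_counts, resistant_counts = (300, 100), (250, 200)
if passes_filters(control_counts, resistant_counts):
    p_value = fisher_exact_two_sided(*control_counts, *resistant_counts)
    print(f"P = {p_value:.3e}")
```

Applying the filter first avoids testing sites whose minor allele is supported by only a handful of reads, which are the sites most likely to reflect sequencing error.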
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean $\alpha/(\alpha+\beta)$ and variance $\alpha\beta/\left((\alpha+\beta)^{2}(\alpha+\beta+1)\right)$. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of $\alpha = 37$ was chosen by chi-square test as the best match to the observed P value distribution ($\alpha = 10$ to $\alpha = 40$ were tested), with a corresponding $\beta = 43.15$ to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10× the size of the observed data
11
set Because the null P value distribution still did not fit theobserved P value distribution particularly well (supplemen-tary fig S2 Supplementary Material online) these values werenot used for a standard FDR correction Instead a P valuethreshold was calculated based on the lowest 005 of thenull P values similar in approach to Orozco-terWengel et al(2012)
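A minimal sketch of the two ingredients described above, null read counts drawn from a beta-binomial and a threshold taken from the lower tail of the null P values, is given below. The function names are hypothetical, and the way the draws are applied (an independent draw of the control-pool share for the major and for the minor allele) is an assumption, not a description of the authors' scripts.

```python
import random

ALPHA, BETA = 37.0, 43.15  # shape parameters reported in the text

def beta_binomial(n, alpha, beta, rng):
    """Draw k from a beta-binomial: p ~ Beta(alpha, beta), then k ~ Binomial(n, p)."""
    p = rng.betavariate(alpha, beta)
    return sum(1 for _ in range(n) if rng.random() < p)

def simulate_null_table(major_total, minor_total, rng):
    """Split each allele's reads between pools under the null model.

    Returns (major_ctrl, minor_ctrl, major_res, minor_res).
    Assumption: the control-pool share is drawn separately for each allele.
    """
    major_ctrl = beta_binomial(major_total, ALPHA, BETA, rng)
    minor_ctrl = beta_binomial(minor_total, ALPHA, BETA, rng)
    return (major_ctrl, minor_ctrl,
            major_total - major_ctrl, minor_total - minor_ctrl)

def lower_quantile_threshold(null_pvalues, fraction=0.0005):
    """Largest P value within the lowest `fraction` (0.05% by default) of the null set."""
    ordered = sorted(null_pvalues)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1]
```

Each simulated table would be passed through the same exact test as the observed data, and observed P values at or below the returned threshold would be treated as candidates.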
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)NS, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to VCF format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using $\chi^2$ tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
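The feature-type comparison can be sketched as a 2x2 Pearson chi-square test with one degree of freedom. The function below is illustrative only: its name and arguments are hypothetical, it treats the candidates as a subset of all SNPs and compares them with the non-candidates, and it applies no continuity correction, none of which is confirmed by the text.

```python
from math import erfc, sqrt

def chi2_feature_test(cand_in, cand_total, all_in, all_total):
    """Pearson chi-square (1 df) for one feature type.

    cand_in / cand_total: candidate SNPs in the feature / all candidate SNPs.
    all_in / all_total:   all SNPs in the feature / all SNPs in the study.
    """
    # Rows: candidate vs non-candidate SNPs; columns: in feature vs not.
    a, b = cand_in, cand_total - cand_in
    c = all_in - cand_in
    d = (all_total - cand_total) - c
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 df.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p
```

The direction of the effect (over- or under-representation) is read from whether `cand_in / cand_total` is larger or smaller than `all_in / all_total`.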
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria, where SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: $10^6$ simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from FlyBase r5.57 were applied.
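The idea behind a SNP-mode permutation test can be sketched as follows. This is not Gowinda's implementation: the function name, data structures, and the simple empirical P value are assumptions chosen to show why sampling random SNP sets (rather than random genes) accounts for gene length, since longer genes carry more SNPs and are hit more often by chance.

```python
import random

def snp_mode_enrichment(candidate_snps, all_snps, snp_to_genes, go_to_genes,
                        n_sim=1000, seed=0):
    """Empirical P value per GO term from random SNP sets of the candidate size.

    all_snps must be a list; snp_to_genes maps a SNP to a set of gene names;
    go_to_genes maps a GO term to a set of gene names (all hypothetical inputs).
    """
    rng = random.Random(seed)

    def term_counts(snps):
        # Genes hit by at least one SNP, counted once per GO term.
        hit = set()
        for snp in snps:
            hit.update(snp_to_genes.get(snp, ()))
        return {term: len(hit & genes) for term, genes in go_to_genes.items()}

    observed = term_counts(candidate_snps)
    at_least = {term: 0 for term in go_to_genes}
    for _ in range(n_sim):
        simulated = term_counts(rng.sample(all_snps, len(candidate_snps)))
        for term in go_to_genes:
            if simulated[term] >= observed[term]:
                at_least[term] += 1
    # Fraction of simulations with at least as many genes as observed.
    return {term: at_least[term] / n_sim for term in go_to_genes}
```

A multiple-testing correction across GO terms would still be needed; the sketch stops at the uncorrected empirical P values.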
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v. 2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
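The first-order network construction can be sketched in plain Python. The analysis itself used the igraph R package; the function below is a hypothetical stand-in that assumes the subgraph is the one induced by the seed proteins plus their direct interactors (so interactions among interactors are retained) and that edges are undirected.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) of the subgraph induced by seeds and their direct interactors.

    edges: iterable of (protein_a, protein_b) pairs from the full PPI network.
    seeds: candidate proteins; seeds absent from the network are ignored.
    """
    edges = [(u, v) for u, v in edges if u != v]  # remove self-connections
    seeds = set(seeds)
    present = {p for edge in edges for p in edge}
    neighbours = set()
    for u, v in edges:
        if u in seeds:
            neighbours.add(v)
        if v in seeds:
            neighbours.add(u)
    nodes = (seeds & present) | neighbours
    # Undirected edges stored as frozensets so (a, b) and (b, a) coincide.
    sub_edges = {frozenset((u, v)) for u, v in edges if u in nodes and v in nodes}
    return nodes, sub_edges
```

Node and edge counts of the returned subgraph correspond to the network sizes reported for the seed proteins.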
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study with alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study and took the following form:

                                      Not in GWAS              In GWAS
Not in Telonis-Scott et al. (2012)    N – intersect – A – B    A – intersect
In Telonis-Scott et al. (2012)        B – intersect            Intersect(A, B)

where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in D. melanogaster reference genome r5.53, n = 17,106.
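The gene list overlap test can be illustrated with a short hypergeometric calculation. The sketch assumes a one-sided (enrichment) test, which may differ from the test actually run, and the function names are hypothetical. The list sizes and genome size are those given above; the overlap of 45 genes is the figure quoted in the abstract.

```python
from math import comb

def overlap_p_value(n_total, n_a, n_b, n_shared):
    """P(overlap >= n_shared) for two gene lists drawn at random from n_total genes."""
    denom = comb(n_total, n_b)
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_shared, upper + 1))
    return tail / denom

def expected_overlap(n_total, n_a, n_b):
    """Mean overlap expected by chance."""
    return n_a * n_b / n_total

# Values from the text: A = 416 genes, B = 382 genes, N = 17,106 genes.
chance = expected_overlap(17106, 416, 382)
p_enrichment = overlap_p_value(17106, 416, 382, 45)
```

Under these assumptions about nine shared genes would be expected by chance, so an overlap of 45 lies far in the upper tail.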
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Due to this limitation, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction, P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
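The resampling procedure for network overlap can be sketched as follows. This is a plain-Python illustration rather than the igraph workflow: the function names are hypothetical, genes are assumed to map one-to-one onto proteins, and the overlap network is taken to be the set of shared edges together with the nodes they connect (which is what remains after zero-degree nodes are dropped).

```python
import random

def first_order_edges(edges, seeds):
    """Undirected edges among the seeds and their direct interactors (no self-loops)."""
    edges = [(u, v) for u, v in edges if u != v]
    seeds = set(seeds)
    nodes = set(seeds)
    for u, v in edges:
        if u in seeds:
            nodes.add(v)
        if v in seeds:
            nodes.add(u)
    return {frozenset((u, v)) for u, v in edges if u in nodes and v in nodes}

def overlap_size(edges, seeds_a, seeds_b):
    """(node count, edge count) of the intersection of two first-order networks."""
    shared = first_order_edges(edges, seeds_a) & first_order_edges(edges, seeds_b)
    nodes = {p for edge in shared for p in edge}  # zero-degree nodes drop out
    return len(nodes), len(shared)

def simulate_overlap(edges, gene_pool, size_a, size_b, observed, n_iter=1000, seed=0):
    """Fraction of resampled gene-set pairs whose overlap is at least the observed one.

    gene_pool must be a list; observed is (node count, edge count).
    """
    rng = random.Random(seed)
    ge_nodes = ge_edges = 0
    for _ in range(n_iter):
        sim_nodes, sim_edges = overlap_size(
            edges, rng.sample(gene_pool, size_a), rng.sample(gene_pool, size_b))
        ge_nodes += sim_nodes >= observed[0]
        ge_edges += sim_edges >= observed[1]
    return ge_nodes / n_iter, ge_edges / n_iter
```

The two returned fractions play the role of empirical P values for node and edge overlap, respectively.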
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship DE120102575 and Monash University Technology voucher scheme to MTS, an ARC Future Fellowship and Discovery Grants to CMS, an ARC Laureate Fellowship to AAH, and a Science and Industry Endowment Fund grant to CMS and AAH.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool. 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013–2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-κB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
Supplementary Material online; mean across control replicates 0.0028, median 0.0015; mean across resistant replicates 0.0027, median 0.0016), corresponding to a mean SD of ~4.5% around the mean allele frequency. The mean pairwise difference in allele frequency estimates was 5.5% and 5.4% for control and desiccation-resistant replicates, respectively, consistent with previously published estimates (4–6%; Kofler et al. 2016).
For a test subset of 1,803 randomly sampled single nucleotide polymorphisms (SNPs) that showed no significant differentiation between control and desiccation-resistant pools, the replicates accounted for the majority of total variance in allele frequency. Replicates within pool category (control vs. desiccation resistant; supplementary fig. S1, Supplementary Material online) explained between 15% and 100% (mean 87%, median 93%) of what was a relatively low total variance (mean 0.0029, median 0.0019). The mean concordance correlation coefficient was 0.98 for each pool category, higher than a comparable Drosophila study that reported a correlation of 0.898 between 2 replicates with 20 individuals (Zhu et al. 2012). We therefore combined the technical replicates into a single control and resistant pool, respectively, for further analyses. After processing, the mean coverage depth across the genome was 487× and 568× per sample, respectively (equivalent to 0.97× and 1.1× per individual in the pools of 500 flies).
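For readers unfamiliar with the concordance correlation coefficient used above to compare replicate allele frequency estimates, a minimal sketch of Lin's coefficient follows. The function name is hypothetical and the population (1/n) form of the variances is assumed; the software actually used for the reported values is not stated in this section.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements x and y.

    Equals 1 only when the points fall on the identity line, so a constant
    offset between replicates lowers the value even if they are perfectly correlated.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Here `x` and `y` would hold the allele frequency estimates of the same SNPs in two technical replicates.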
Allele Frequency Changes
The allele frequencies of 3,528,158 SNPs were compared between the 2 pools using Fisher's exact tests with a permutation-based false discovery rate (FDR) correction chosen based on the top 0.05% of the simulated null P values. Six hundred forty-eight SNPs were differentiated between desiccation-resistant and control pools (fig. 1 and supplementary table S1, Supplementary Material online). The SNPs were nonrandomly distributed across chromosomes ($\chi^2_4 = 43.12$, $P < 0.0001$), highly abundant on the X chromosome ($\chi^2_1 = 36.98$, $P < 0.0001$), while SNPs were underrepresented on both 2L and 3L (2L: $\chi^2_1 = 6.41$, $P < 0.01$; 3L: $\chi^2_1 = 3.97$, $P < 0.05$). In the desiccation pool, the allelic distribution of the 648 SNPs ranged from low to fixed (4–100%; fig. 1B and supplementary table S1, Supplementary Material online). The increase in frequency of the favored "desiccation" alleles compared with the controls ranged from 3% to 41% (median increase 14%; supplementary table S1, Supplementary Material online), where the largest shifts tended to occur in common intermediate frequencies (fig. 1B).

Fig. 1. (A) Manhattan plot of genome-wide P values for differentiation between desiccation-resistant and random control pools, shown for the entire genome, where each point represents one SNP. SNPs highlighted in red showed differentiation above the 0.05% threshold level determined from the null P value distribution (see text for details). SNPs highlighted in blue were above the 0.05% threshold and also detected in Telonis-Scott et al. (2012). (B) Standing allele frequency in the control pool plotted against the frequency change in the selected pool for the 648 differentiated SNPs (FDR 0.05%).
Inversion Frequencies
To test whether inversions were associated with desiccation resistance, allele frequencies were checked at SNPs diagnostic for the following inversions: In(3R)P (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)NS, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). In(3L)P, In(3R)K, and In(3R)Mo were not detected in this population, and for all other inversions investigated, the control and desiccation-resistant pool frequencies did not differ by more than 6%.
Genomic Distribution
We examined these candidates using a comprehensive approach encompassing SNP, gene, functional ontology, and network levels. First, we examined SNP locations with respect to gene structure and putative effects of SNPs on genome features. The 648 SNPs mapped to 382 genes and were largely noncoding (79%; supplementary table S2, Supplementary Material online). Coding variants comprised nearly 15% of candidate SNPs, largely synonymous substitutions and the remainder nonsynonymous missense variants predicted to result in amino acid substitution with length preservation (9 and 14; supplementary table S2, Supplementary Material online). Based on proportions of features from all annotated SNPs compared with the candidates, intron variants were significantly underrepresented ($\chi^2_1 = 7.06$, $P < 0.01$; supplementary table S2, Supplementary Material online), while SNPs were enriched in 5′ UTRs, 5′ UTR premature start codon sites, and splice regions ($\chi^2_1 = 8.20$, $P < 0.01$; $\chi^2_1 = 31.24$, $P < 0.0001$; $\chi^2_1 = 6.40$, $P < 0.05$, respectively; supplementary table S2, Supplementary Material online).
GO Functional Enrichment
For the GO analyses, we found no overrepresented terms using the stringent Gowinda gene mode but identified several categories using SNP mode with a 0.05 FDR threshold. GO terms were enriched for the broad categories of "chromatin," "chromosome," and, more specifically, "double-stranded RNA binding," although significance of this category was due solely to nine differentiated SNPs in the DIP1 gene. The term "gamma-secretase complex" was also enriched due to three SNPs in the pen-2 gene.
Desiccation Candidate Genes
Chown et al. (2011) comprehensively summarized the primary mechanisms underpinning desiccation resistance, from the critical first phase of stress sensing and behavioral avoidance to the manifold physiological trajectories available to counter water stress. These include water content variation, water loss rates (desiccation resistance), and tolerance of water loss (dehydration tolerance). We expanded on this review to incorporate recent molecular research and summarized the primary mechanisms and known candidate genes relevant to D. melanogaster stress sensing and water balance (table 1 and references therein). Informed by this body of work, we applied a "top-down" mechanistic approach to screening desiccation candidate genes in our data (e.g., from initial hygrosensing to downstream key physiological pathways; table 1). We did not observe differentiation in SNPs mapping to the TRP ion channels, the key hygrosensing receptor genes. However, we did map SNPs to genes involved in stress sensing specific to the MT fluid secretion signaling pathways, including two of the six cAMP and cyclic guanosine monophosphate (cGMP) signaling phosphodiesterases, dnc and pde9, as well as klu, implicated in NF-κB ortholog signaling in the renal tubules (table 1). A number of the stress-sensing SAPK pathway genes were differentiated (table 1). Key genes involved in the insulin signaling pathway, implicated in metabolic homeostasis and water balance, were also differentiated, including the insulin receptor InR and the insulin receptor-binding Dok (table 1).
Network Analysis: Modeling Genomic Differentiation for Desiccation in a PPI Context
Given the genomic complexity underpinning desiccation resistance, we also implemented a higher-order analysis to provide an overview of potential architectures beyond the SNP and gene levels using PPI networks. The SNPs mapped to 382 genes, which were annotated to 307 seed proteins occurring in the full background D. melanogaster PPI network. The seed proteins produced a large first-order network with 3,324 nodes and 51,299 edges.
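To make the idea of a first-order network concrete, here is a minimal sketch that extracts one from a background interaction list. It assumes that "first-order" means the seed proteins plus their direct interactors, connected by every background interaction that touches at least one seed; the exact definition used by the network tool in the study may differ, and the protein names below are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def first_order_network(
    edges: Iterable[Tuple[str, str]], seeds: Iterable[str]
) -> Tuple[Set[str], Set[FrozenSet[str]]]:
    """Return (nodes, edges) of the first-order network around the seeds.

    Assumption: an interaction is kept when at least one partner is a seed.
    Seeds with no interactions in the background do not appear as nodes.
    """
    seed_set = set(seeds)
    kept = {frozenset(e) for e in edges if e[0] in seed_set or e[1] in seed_set}
    nodes: Set[str] = set()
    for edge in kept:
        nodes.update(edge)
    return nodes, kept


# Hypothetical toy background network and seed list.
background = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "A")]
nodes, kept_edges = first_order_network(background, seeds=["A"])
print(sorted(nodes), len(kept_edges))
```

Representing each interaction as a frozenset makes the edges undirected, so that A-B and B-A count as the same interaction when networks are later compared.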
To determine whether broader functional signatures could be ascertained from the complex network, we employed functional enrichment analyses on the 3,324 nodes (table 2 and supplementary table S3, Supplementary Material online). Strikingly, the network was enriched for pathways involved in gene expression, from transcription through RNA processing to RNA transport, metabolism, and decay (table 2). The network analysis also revealed further complexity of the stress response, with enrichment of proteins involved in regulating innate insect immunity (table 2), where multiple pathways were overrepresented, including cytokine, interleukin, and toll-like receptor signaling cascades (table 2). Developmental/gene regulation signaling pathways were significant in the pathway analysis: the KEGG database showed enrichment for Notch and mitogen-activated protein kinase (MAPK) signaling, while insulin and NGF pathways were highlighted from the Reactome database (table 2).
Common Signatures of Desiccation Resistance across Studies
The key finding of this study is the degree of genomic overlap with our previous mapping of alleles associated with artificially selected desiccation phenotypes (Telonis-Scott et al. 2012). Although the flies were sampled from comparable southeastern Australian locations, the study designs were different. Specifically, the earlier study focused on resistance generated by artificial selection, whereas this study focused on natural variation in resistance. Despite the differences in study design, we identified 45 genes common to both studies
Table 1. Summary of Primary Molecular Responses and Candidate Genes for Desiccation Response Mechanisms Identified in the Literature.

| Mechanism | Molecular Function | Genes | Current Study (F1 wild-caught) | Telonis-Scott et al. (2012) (F8 artificial selection) |
| Hygrosensing: Sensory moisture receptors (antennae, legs, gut, mouthparts) | | | | |
| Temperature-activated transient receptor potential ion channels [b,c,d] | TRP ion channels | trpl, trp, Trpγ, TrpA1, Trpm, Trpml, pain, pyx, wtrw, nompC, iav, nan, Amo | - | trpl |
| Stress sensing: Malpighian tubules (principal cells); tubule fluid secretion signaling pathways | | | | |
| cAMP signaling [e,f,g] | cAMP activation | Dh44-R2 | - | - |
| cGMP signaling [e,f,g] | cGMP activation | Capa, CapaR, Nplp1-4 | - | - |
| | Receptors | Gyc76c, CG33958, CG34357 | - | CG34357 |
| | Kinases | dg1-2 | - | - |
| cAMP + cGMP signaling [e,f,g,h] | Hydrolyzing phosphodiesterases | dnc, pde1c, pde6, pde9, pde11 | dnc, pde9 | dnc, pde1c, pde9, pde11 |
| Calcium signaling [e,g] | Calcium-sensitive nitric oxide synthase | Nos | - | Nos |
| Desiccation specific | Diuretic neuropeptide NF-κB signaling [i,j,k] | Capa, CapaR, trpl, Relish, klu | klu | klu, trpl |
| | Anti-diuretic neuropeptide [l] | ITP, sNPF, Tk | - | ITP, sNPF |
| Stress sensing: Stress responsive pathway | | | | |
| Stress activated protein kinase pathway [m,n] | MAPK/Jun kinase | Aop, Aplip1, Atf-2, bsk, Btk29A, cbt, Cdc42, Cka, cno, CYLD, Dok, egr, Gadd45, hep, hppy, Jra, kay, medo, Mkk4, msn, Mtl, Pak, puc, pyd, Rac1-2, Rho1, shark, slpr, Src42A, Src64B, Tak1, Traf6 | Aop, Dok, hep, kay, msn, Mtl, puc, Rac1, Src64B | Mtl, slpr |
| Metabolic homeostasis and water balance | | | | |
| Components of insulin signaling pathway | Receptor [o,p] | InR | InR | - |
| | Receptor binding | chico, Ilp1-8, dock, Dok, poly | Dok | poly |
| Resistance mechanisms: Water loss barriers | | | | |
| Cuticular hydrocarbons | Fatty acid synthases [q], desaturases [r], transporters, oxidative decarbonylase [t] | CG3523, CG3524, desat1-2, FatP, Cyp4g1 | CG3523 | - |
| Primary hemolymph sugar/tissue-protectant | | | | |
| Trehalose metabolism | Trehalose-phosphatase [b,u], trehalase activity [b,u], phosphoglycerate mutase | Tps1, Treh, Pgm, CG5171, CG5177, CG6262, crc | Treh, Pgm | - |

NOTE. Candidate genes differentiated in the current study and in Telonis-Scott et al. (2012) are shown and are underlined where overlapping.
a For brevity, several comprehensive reviews are cited, and readers are referred to references therein. Note that water budgeting via discontinuous gas exchange was not included due to a lack of "bona fide" molecular candidates, and desiccation tolerance due to aquaporins was omitted as genes were not differentiated in either of our studies.
b Chown et al. (2011).
c Johnson and Carder (2012).
d Fowler and Montell (2013).
e Day et al. (2005).
f Davies et al. (2012).
g Davies et al. (2013).
h Davies et al. (2014).
i Terhzaz et al. (2012).
j Terhzaz et al. (2014).
k Terhzaz et al. (2015).
l Kahsai et al. (2010).
m Huang and Tunnacliffe (2004).
n Huang and Tunnacliffe (2005).
o Söderberg et al. (2011).
p Liu et al. (2015).
q Chung et al. (2014).
r Sinclair et al. (2007).
s Foley and Telonis-Scott (2011).
t Qiu et al. (2012).
u Thorat et al. (2012).
Conserved Genomic Signatures of Drosophila Desiccation Resistance. doi:10.1093/molbev/msv349. MBE
out of the 382 and 416 genes for the current and 2012 study, respectively (Fisher's exact test for overlap, $P < 2 \times 10^{-16}$, odds ratio = 5.90; supplementary table S4, Supplementary Material online). Bearing gene length bias in mind, we compared the average length of genes in both studies, as well as in the gene overlap, with the average gene length in the Drosophila genome (supplementary fig. S3, Supplementary Material online). Although there was bias toward longer genes in the overlap list, this was also apparent in the full gene lists from both studies.
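The overlap test can be illustrated with a 2 x 2 table built from the two gene lists. The sketch below is not the authors' code: it computes the simple sample odds ratio and a one-sided hypergeometric tail probability, whereas statistical packages often report a conditional maximum-likelihood odds ratio. The background size `N_BACKGROUND` is not stated in the text; the value used is hypothetical and was chosen only so that the sample odds ratio lands near the reported 5.90.

```python
import math


def overlap_odds_ratio(n_a: int, n_b: int, n_both: int, n_background: int) -> float:
    """Sample odds ratio of the 2 x 2 table (in list A or not) x (in list B or not)."""
    a = n_both                               # in both lists
    b = n_a - n_both                         # only in list A
    c = n_b - n_both                         # only in list B
    d = n_background - n_a - n_b + n_both    # in neither list
    return (a * d) / (b * c)


def hypergeom_upper_tail(n_a: int, n_b: int, n_both: int, n_background: int) -> float:
    """P(overlap >= n_both) if list B were drawn at random from the background."""
    total = math.comb(n_background, n_b)
    tail = sum(
        math.comb(n_a, k) * math.comb(n_background - n_a, n_b - k)
        for k in range(n_both, min(n_a, n_b) + 1)
    )
    return tail / total


N_BACKGROUND = 17_145  # hypothetical gene universe; NOT given in the text

odds = overlap_odds_ratio(382, 416, 45, N_BACKGROUND)
p_one_sided = hypergeom_upper_tail(382, 416, 45, N_BACKGROUND)
print(f"odds ratio = {odds:.2f}, one-sided p = {p_one_sided:.2g}")
```

Because both quantities depend on the background, any real re-analysis would need the gene universe actually used in the study.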
The overlap genes mapped to all of the major chromosomes, and their differentiated variants were mostly noncoding SNPs located in introns (supplementary table S4, Supplementary Material online). Two SNPs were synonymous (Strn-Mlck, coding for a protein kinase, and Cht7, chitin binding; supplementary table S4, Supplementary Material online), while a 3′UTR variant was predicted to putatively alter protein structure in a nondisruptive manner, most likely impacting expression and/or protein translational efficiency (jbug, response to stimulus; supplementary table S4, Supplementary Material online). One nonsynonymous SNP was predicted to alter protein function as a missense variant (fab1, signaling; supplementary table S4, Supplementary Material online). The increase in frequency of the favored desiccation alleles compared with the controls ranged from 5% to 27%, with a median increase of 16.4%.
Despite the strong overlap, there were important differences between the two studies. For example, our previous work revealed evidence of a selective sweep in the 5′ promoter region of the Dys gene in the artificially selected lines, and we confirmed a higher frequency of the selected SNPs in an independent natural association study in resistant flies from Coffs Harbour (Telonis-Scott et al. 2012). In this study, we found no differentiation at the Dys locus between our resistant and random controls. However, both resistant and control pools had a high frequency of the selected allele detected in Telonis-Scott et al. (2012) (supplementary table S5, Supplementary Material online).
The 45 genes common to both studies were analyzed for biological function. While many genes are obviously pleiotropic (i.e., assigned multiple GO terms), almost half were involved in response to stimuli (cellular, behavioral, and regulatory), including signaling such as Rho protein signal transduction and cAMP-mediated signaling (table 3). Other categories of interest include spiracle morphogenesis, sensory organ development (including development of the compound eye), and stem cell and neuroblast differentiation (table 3).
With regard to specific stress response/desiccation candidates (table 1), 4 of the 45 genes overlapped, including the cAMP/cGMP signaling phosphodiesterases pde9 and dnc, the MAPK pathway gene Mtl, and klu, implicated in NF-κB orthologue signaling in the renal tubules (table 1). Furthermore, there was overlap among mechanistic categories, where different genes in the same pathways were detected. For example, the additional key cAMP/cGMP hydrolyzing phosphodiesterases pde1c and pde11 were significant in Telonis-Scott et al. (2012), as were slpr (MAPK pathway) and poly (insulin signaling) (table 1).
Table 2. Pathway Enrichment of the GWAS and Telonis-Scott et al. First-Order PPI Networks Investigated Using FlyMine (FDR P < 0.05).
Current Study GWAS | Telonis-Scott et al. (2012)
Database | Pathway | Pathway
KEGG: Ribosome; Ribosome; Notch signaling pathway; Notch signaling pathway; MAPK signaling pathway; Dorso-ventral axis formation; Progesterone-mediated oocyte maturation; Wnt signaling pathway; Endocytosis; Proteasome
Reactome: Immune system; Immune system; Adaptive immune system; Adaptive immune system; Innate immune system; Innate immune system; Signaling by interleukins; Signaling by interleukins; (Toll-like receptor cascades); Toll-like receptor cascades; TCR signaling; (B cell receptor signaling)
Gene expression; Gene expression; Metabolism of RNA; Metabolism of RNA; Nonsense-mediated decay; Nonsense-mediated decay; Transcription; mRNA processing; (RNA transport); Metabolism of proteins; (Metabolism of proteins); Translation; (Translation)
Cell cycle; Cell cycle; Cell cycle checkpoints; Regulation of mitotic cell cycle; Mitotic metaphase and anaphase; G0 and early G1; G0 and early G1; G1 phase; (G1/S transition); G2/M transition; (DNA replication/S phase)
DNA repair; Double-strand break repair
Signaling pathways; Signaling pathways; Insulin signaling pathway; Signaling by NGF; Signaling by NGF; Signaling by FGFR; Signaling by NOTCH; Signaling by Wnt; Signaling by EGFR
Development; Axon guidance
Membrane trafficking; Gap junction trafficking and regulation
Programmed cell death; Apoptosis
Hemostasis; Hemostasis
NOTE. KEGG and Reactome pathway databases were queried. All significant terms are reported, with Reactome terms grouped together under parent terms in italic text (full list of terms available in supplementary table S3, Supplementary Material online). Parent terms without brackets are the names of pathways that were significantly enriched; parent terms in parentheses were not listed as significantly enriched themselves but are reported as a guide to the category contents.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. The 307 GWAS seeds formed a network of 3,324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3,215 nodes. Of these, 1,826 nodes (~55%) overlapped between networks, which was significantly higher than the null distribution estimated from 1,000 simulations (mean simulated overlap ± SD: 1,304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25,704 ± 4,024 edges; observed overlap: 30,686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online) as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
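The percentiles quoted for the observed overlaps can be related to the simulated means and standard deviations with a short calculation. The sketch below assumes, as the figure legend suggests, that the percentile is read from a normal distribution fitted to the simulated overlaps; it also includes an empirical alternative for use when the raw simulation values are available. Function names are illustrative.

```python
import math
from typing import Sequence


def normal_percentile(observed: float, mean: float, sd: float) -> float:
    """Percentile of `observed` under a normal distribution with the given mean and SD."""
    z = (observed - mean) / sd
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def empirical_percentile(observed: float, simulated: Sequence[float]) -> float:
    """Percentage of simulated overlaps that fall strictly below the observed overlap."""
    below = sum(1 for value in simulated if value < observed)
    return 100.0 * below / len(simulated)


# Summary values as given in the text.
node_pct = normal_percentile(1826, mean=1304, sd=143)
edge_pct = normal_percentile(30686, mean=25704, sd=4024)
print(f"nodes: {node_pct:.2f}th percentile, edges: {edge_pct:.1f}th percentile")
```

With only 1,000 simulations an empirical percentile cannot resolve beyond 99.9, which is one reason a fitted distribution may be used to label extreme observations.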
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Both networks showed enrichment in the immune system pathways, the ribosome, and the Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways. The network constructed from the Telonis-Scott et al. (2012) gene list, however, showed extensive involvement of cell cycle pathway proteins as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we
Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.

| Term ID | Term (Subterms) | FDR P value [a] | Genes |
| Cellular/behavioral response to stimulus, defense, and signaling | | | |
| GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c |
| GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| Development | | | |
| GO:0048863 | Stem cell differentiation | 0.014 | Abd-B, mam, nkd, klu, nub |
| GO:0014016 | Neuroblast differentiation | 0.048 | |
| GO:0045165 | Cell fate commitment | 0.010 | Abd-B, dnc, fz, mam, nkd, ct, klu, klg, nub |
| GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-B, ct, cv-c |
| GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30c, Mtl |
| GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658 |
| GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658 |
| Cellular component organization | | | |
| GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30c, Mtl, Trim9, nub, cv-c |

NOTE. Most genes are assigned to multiple annotations and therefore are shown purely as a guide to putative function.
a Bonferroni-Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or to adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD, such as D. melanogaster, somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short- and long-range LD and contribute numerous false positive SNP-phenotype associations (Hoffmann
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53) as described in the text. The histogram shows the distribution of overlap measures for 1,000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and the Telonis-Scott et al. (2012) candidates (labeled "previous study"). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; 2B). (D) Overlap expressed as the number of interactions. This was not significantly higher than expected by simulation (89th percentile; 2B). (E) Overlap at the level of GO biological process terms (GO terms were compared directly by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013) and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. These genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008), while nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′UTR and splice variants at the SNP level and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that standing genetic variation in natural D. melanogaster can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004-2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1,200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4-5 days old. For the desiccation assay, 10 females per line (over 9,000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies, <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the late mortality tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies, n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris, Inc., Woburn, MA), and libraries were prepared from 1 μg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with Trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
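The trimming rules described above can be illustrated with a small stand-alone function. This is a simplified sketch rather than a reimplementation of Trimmomatic: it assumes the steps are applied in the order leading/trailing trim, sliding window, then minimum length, and that the sliding-window step cuts the read at the start of the first window whose mean quality falls below the threshold. The function and parameter names are illustrative.

```python
from typing import List, Optional, Tuple


def trim_read(
    seq: str,
    quals: List[int],
    end_min: int = 30,      # leading/trailing quality threshold from the text
    window: int = 10,       # sliding window width from the text
    window_min: int = 25,   # sliding window mean-quality threshold from the text
    min_len: int = 40,      # minimum read length from the text
) -> Optional[Tuple[str, List[int]]]:
    """Return the trimmed (sequence, qualities), or None if the read is discarded."""
    # 1) Remove leading and trailing bases below the end threshold.
    start = 0
    while start < len(quals) and quals[start] < end_min:
        start += 1
    end = len(quals)
    while end > start and quals[end - 1] < end_min:
        end -= 1
    seq, quals = seq[start:end], quals[start:end]

    # 2) Sliding window: cut at the first window whose mean quality is too low.
    cut = len(quals)
    for i in range(len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < window_min:
            cut = i
            break
    seq, quals = seq[:cut], quals[:cut]

    # 3) Discard reads that are too short after trimming.
    if len(seq) < min_len:
        return None
    return seq, quals


# Toy read: low-quality ends around a 50-base high-quality core.
kept = trim_read("A" * 55, [10] * 3 + [35] * 50 + [10] * 2)
# Toy read with a low-quality stretch in the middle; it is cut short and discarded.
dropped = trim_read("A" * 60, [35] * 20 + [20] * 10 + [35] * 30)
print(len(kept[0]) if kept else None, dropped)
```

Because paired-end reads were used downstream only when both mates survived, a read discarded here would also remove its mate from the alignment input.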
Post-trimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150); 1% rate of missing alignments, given 2% uniform base error rate (-n 0.01); a maximum of 2 gap opens per read (-o 2); a maximum of 12 gap extensions per read (-e 12); and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the “SortSam” option in Picard v1.96.
At this point, the bam files were merged into one file per sample using PicardMerge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This list was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately, for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency < 0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
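As an illustration of the replicate comparison, the following Python sketch computes Lin's concordance correlation coefficient for each pair of replicates and averages it within a pool category. It assumes the standard definition with population (divide-by-n) moments, which may differ in detail from the epi.ccc implementation; the function names and the allele frequencies are hypothetical.

```python
from itertools import combinations
from statistics import fmean


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient using population moments."""
    mx, my = fmean(x), fmean(y)
    sxx = fmean([(a - mx) ** 2 for a in x])
    syy = fmean([(b - my) ** 2 for b in y])
    sxy = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def mean_pairwise_ccc(replicates):
    """Average the coefficient over all pairs of replicates in one pool category."""
    return fmean([concordance_cc(a, b) for a, b in combinations(replicates, 2)])


# Hypothetical allele frequency estimates at four SNPs in three replicates.
control_reps = [
    [0.10, 0.25, 0.40, 0.55],
    [0.12, 0.24, 0.41, 0.53],
    [0.09, 0.27, 0.38, 0.56],
]
mean_ccc = mean_pairwise_ccc(control_reps)
```

The coefficient equals 1 only when two replicates agree exactly; both a systematic shift and scatter between replicates reduce it.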
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five “desiccation-resistant” and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a “sync” file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow, -norna, -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher’s exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, there is currently no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014); correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012); use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010); and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
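The per-SNP test and filters described above can be sketched in Python as follows. This is not the Popoolation2 script: the function names are hypothetical, the test is assumed to be two-sided, and the minimum count is assumed to apply to the minor allele summed over both pools.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def snp_p_value(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """control and resistant are (major, minor) read counts; None if the SNP is filtered."""
    for counts in (control, resistant):
        if not min_cov <= sum(counts) <= max_cov:
            return None
    if control[1] + resistant[1] < min_count:
        return None
    return fisher_exact_two_sided(control[0], control[1], resistant[0], resistant[1])


# Hypothetical read counts for one SNP in the control and resistant pools.
p_value = snp_p_value((400, 80), (380, 160))
```

Exact integer arithmetic via math.comb avoids overflow at the coverage depths used here, at the cost of speed; a genome-wide run would need a faster implementation.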
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean a/(a + b) and variance ab/[(a + b)²(a + b + 1)]. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of a = 37 was chosen by chi-square test as the best match to the observed P value distribution (a = 10 to a = 40 were tested), with a corresponding b = 43.15 (that is, b = a × 568/487) to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10 times the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
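The pieces of the null model can be sketched in Python under stated assumptions. With a = 37 and b = 43.15, the mean a/(a + b) equals 487/(487 + 568), so the sketch treats the beta-binomial as describing the share of an allele's reads that falls in the control pool; this reading is an assumption of the sketch, not a statement of the published procedure. All function names are hypothetical, and the threshold fraction is passed explicitly rather than fixed.

```python
import random


def beta_moments(a, b):
    """Mean and variance of the beta distribution underlying the beta-binomial."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def beta_binomial_draw(n_reads, a, b, rng):
    """Draw p ~ Beta(a, b), then count how many of n_reads fall in the first pool."""
    p = rng.betavariate(a, b)
    return sum(rng.random() < p for _ in range(n_reads))


def null_threshold(null_p_values, fraction):
    """P value at the given lower fraction of the simulated null distribution."""
    ranked = sorted(null_p_values)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]


a = 37.0
b = a * 568 / 487  # scaled by the coverage ratio of the two pools, about 43.15
rng = random.Random(1)
draws = [beta_binomial_draw(100, a, b, rng) for _ in range(2000)]
observed_share = sum(draws) / (100 * len(draws))
```

In a full simulation, each null draw would be converted to a 2 × 2 table and tested in the same way as the observed data, and null_threshold would then be applied to the resulting null P values.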
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), and In(2R)NS, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by Fisher’s exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool and, in cases where both the major and minor alleles differed from the reference genome, setting the reference allele as the other allele present. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster r5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter “-ud 0” to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or underenrichment of feature types by comparing the proportion of “all” features from all SNPs detected in the study with the proportion of “candidate” SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
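A feature enrichment test of this kind can be sketched in Python as a Pearson χ² test on a 2 × 2 table with one degree of freedom. The exact table layout and the absence of a continuity correction are assumptions of this sketch, and the counts are hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts: candidate SNPs inside/outside a feature type (first row)
# and all remaining SNPs inside/outside the same feature type (second row).
chi2, p = chi2_2x2(10, 20, 20, 10)
```

A feature type is overrepresented among candidates when the first row carries a larger share of in-feature SNPs than the second, and underrepresented in the opposite case; the statistic itself does not indicate direction.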
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10⁶ simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, in which the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This “full” PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study and the alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The “overlap” analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher’s exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study, and took the form of a 2 × 2 table classifying each gene by its presence in or absence from each gene list, where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; and N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
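The gene list overlap test can be sketched in Python using the list sizes given above and the overlap of 45 genes reported in the abstract. The one-sided (upper-tail) form of the test is an assumption of this sketch, and the function names are hypothetical.

```python
from math import comb


def overlap_table(n_total, n_a, n_b, n_shared):
    """Cells of the 2x2 table: (in neither list, in B only, in A only, in both)."""
    return (n_total - n_a - n_b + n_shared, n_b - n_shared, n_a - n_shared, n_shared)


def overlap_p_value(n_total, n_a, n_b, n_shared):
    """Upper-tail hypergeometric probability of sharing at least n_shared genes."""
    denom = comb(n_total, n_b)
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_shared, upper + 1))
    return tail / denom


# N = 17,106 annotated genes, A = 416 genes, B = 382 genes, 45 genes shared.
cells = overlap_table(17106, 416, 382, 45)
expected_overlap = 416 * 382 / 17106
p_overlap = overlap_p_value(17106, 416, 382, 45)
```

Under random sampling, about nine shared genes would be expected, so an overlap of 45 lies far in the upper tail of the null distribution.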
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolve alleles than array-based genotyping. Because of this limitation of the array data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization; Benjamini–Hochberg FDR correction, P < 0.05; Ontology = biological process; and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
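The resampling scheme can be sketched in plain Python without the igraph package. The sketch assumes that a first-order network is the subgraph induced by the seed proteins and their direct interactors, and that the empirical P value uses a +1 correction; both are assumptions rather than details given in the text. The function names and the toy interaction list are hypothetical.

```python
import random


def first_order_network(seeds, ppi_edges):
    """Edges among the seed proteins and their direct interactors (self-loops dropped)."""
    seeds = set(seeds)
    edges = {frozenset(e) for e in ppi_edges if e[0] != e[1]}
    nodes = set()
    for e in edges:
        if e & seeds:
            nodes |= e
    return {e for e in edges if e <= nodes}


def overlap_size(net_a, net_b):
    """Node and edge counts of the shared network; zero-degree nodes are excluded."""
    shared = net_a & net_b
    nodes = set().union(*shared) if shared else set()
    return len(nodes), len(shared)


def simulate_null(universe, size_a, size_b, ppi_edges, n_iter=1000, seed=0):
    """Overlap sizes for pairs of randomly resampled gene sets."""
    rng = random.Random(seed)
    universe = sorted(universe)
    results = []
    for _ in range(n_iter):
        net_a = first_order_network(rng.sample(universe, size_a), ppi_edges)
        net_b = first_order_network(rng.sample(universe, size_b), ppi_edges)
        results.append(overlap_size(net_a, net_b))
    return results


def empirical_p(observed, null_values):
    """Share of null values at least as large as the observed value (+1 correction)."""
    return (1 + sum(v >= observed for v in null_values)) / (1 + len(null_values))


# Toy interaction list; the last entry is a self-connection and is ignored.
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "E")]
observed = overlap_size(first_order_network({"A"}, ppi), first_order_network({"B"}, ppi))
null = simulate_null({"A", "B", "C", "D", "E"}, 1, 1, ppi, n_iter=20)
```

Node and edge counts from the null iterations would each be compared with the observed counts, for example with empirical_p applied separately to the node and edge columns.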
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet 9:e1003534.
Bergland AO, Behrman EL, O’Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol 57:1070–1084.
Contingency table for the gene list overlap test (A, B, and N as defined in the text; I = number of genes shared by A and B):

                                     Not in GWAS      In GWAS
Not in Telonis-Scott et al. (2012)   N − A − B + I    B − I
In Telonis-Scott et al. (2012)       A − I            I
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response ‘cross-talk’ in the Drosophila Malpighian tubule. J Insect Physiol 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
the largest shifts tended to occur in common intermediate frequencies (fig. 1B).
Inversion Frequencies
To test whether inversions were associated with desiccation resistance, allele frequencies were checked at SNPs diagnostic for the following inversions: In(3R)P (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)NS, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). In(3L)P, In(3R)K, and In(3R)Mo were not detected in this population, and for all other inversions investigated the control and desiccation-resistant pool frequencies did not differ by more than 6%.
Genomic Distribution
We examined these candidates using a comprehensive approach encompassing SNP, gene, functional ontology, and network levels. First, we examined SNP locations with respect to gene structure and putative effects of SNPs on genome features. The 648 SNPs mapped to 382 genes and were largely noncoding (79%; supplementary table S2, Supplementary Material online). Coding variants comprised nearly 15% of candidate SNPs, largely synonymous substitutions, and the remainder nonsynonymous missense variants predicted to result in amino acid substitution with length preservation (9 and 14; supplementary table S2, Supplementary Material online). Based on proportions of features from all annotated SNPs compared with the candidates, intron variants were significantly underrepresented (χ²₁ = 7.06, P < 0.01; supplementary table S2, Supplementary Material online), while SNPs were enriched in 5′ UTRs, 5′ UTR premature start codon sites, and splice regions (χ²₁ = 8.20, P < 0.01; χ²₁ = 31.24, P < 0.0001; χ²₁ = 6.40, P < 0.05, respectively; supplementary table S2, Supplementary Material online).
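A feature-level comparison of this kind can be sketched as a one-degree-of-freedom chi-square test on a 2 × 2 table. The snippet below is a minimal illustration only, not the authors' script: the function name `chi2_2x2` and all counts are hypothetical, and it assumes a Pearson test without continuity correction, comparing candidate SNPs against the remaining annotated SNPs.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]].

    Rows: candidate SNPs / all other annotated SNPs (assumed layout).
    Columns: SNP falls in the feature / SNP falls elsewhere.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df the upper-tail probability has a closed form via erfc.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 30 of 648 candidates in a feature class,
# versus 20,000 of 1,000,000 background SNPs in the same class.
stat, p_value = chi2_2x2(30, 618, 20_000, 980_000)
print(f"chi-square = {stat:.2f}, P = {p_value:.2e}")
```

The same function applies to each feature class in turn (introns, 5′ UTRs, splice regions), with one table per class.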
GO Functional Enrichment
For the GO analyses, we found no overrepresented terms using the stringent Gowinda gene mode but identified several categories using SNP mode with a 0.05 FDR threshold. GO terms were enriched for the broad categories of "chromatin" and "chromosome" and, more specifically, "double-stranded RNA binding," although significance of this category was due solely to nine differentiated SNPs in the DIP1 gene. The term "gamma-secretase complex" was also enriched due to three SNPs in the pen-2 gene.
Desiccation Candidate Genes
Chown et al. (2011) comprehensively summarized the primary mechanisms underpinning desiccation resistance, from the critical first phase of stress sensing and behavioral avoidance to the manifold physiological trajectories available to counter water stress. These include water content variation, water loss rates (desiccation resistance), and tolerance of water loss (dehydration tolerance). We expanded on this review to incorporate recent molecular research and summarized the primary mechanisms and known candidate genes relevant to D. melanogaster stress sensing and water balance (table 1 and references therein). Informed by this body of work, we applied a "top-down" mechanistic approach to screening desiccation candidate genes in our data (e.g., from initial hygrosensing to downstream key physiological pathways; table 1). We did not observe differentiation in SNPs mapping to the TRP ion channels, the key hygrosensing receptor genes. However, we did map SNPs to genes involved in stress sensing specific to the MT fluid secretion signaling pathways, including two of the six cAMP and cyclic guanosine monophosphate (cGMP) signaling phosphodiesterases, dnc and pde9, as well as klu, indicated in NF-kB ortholog signaling in the renal tubules (table 1). A number of the stress-sensing SAPK pathway genes were differentiated (table 1). Key genes involved in the insulin signaling pathway, implicated in metabolic homeostasis and water balance, were also differentiated, including the insulin receptor InR and insulin receptor binding Dok (table 1).
Network Analysis: Modeling Genomic Differentiation for Desiccation in a PPI Context
Given the genomic complexity underpinning desiccation resistance, we also implemented a higher-order analysis to provide an overview of potential architectures beyond the SNP and gene levels using PPI networks. The SNPs mapped to 382 genes, which were annotated to 307 seed proteins occurring in the full background D. melanogaster PPI network. The seed proteins produced a large first-order network with 3324 nodes and 51299 edges.
To determine if broader functional signatures could be ascertained from the complex network, we employed functional enrichment analyses on the 3324 nodes (table 2 and supplementary table S3, Supplementary Material online). Strikingly, the network was enriched for pathways involved in gene expression, from transcription through RNA processing to RNA transport, metabolism, and decay (table 2). The network analysis also revealed further complexity of the stress response, with enrichment of proteins involved in regulating innate insect immunity (table 2), where multiple pathways were overrepresented, including cytokine, interleukin, and toll-like receptor signaling cascades (table 2). Developmental/gene regulation signaling pathways were significant in the pathway analysis; the KEGG database showed enrichment for Notch and mitogen-activated protein kinase (MAPK) signaling, while insulin and NGF pathways were highlighted from the Reactome database (table 2).
Common Signatures of Desiccation Resistance across Studies
The key finding of this study is the degree of genomic overlap with our previous mapping of alleles associated with artificially selected desiccation phenotypes (Telonis-Scott et al. 2012). Although the flies were sampled from comparable southeastern Australian locations, the study designs were different. Specifically, the earlier study focused on resistance generated by artificial selection, whereas this study focused on natural variation in resistance. Despite the differences in study design, we identified 45 genes common to both studies
Table 1. Summary of Primary Molecular Responses and Candidate Genes for Desiccation Response Mechanisms Identified in the Literature.

| Mechanism | Molecular Function | Genes | Current Study (F1 wild-caught) | Telonis-Scott et al. (2012) (F8 artificial selection) |
|---|---|---|---|---|
| Hygrosensing: sensory moisture receptors (antennae, legs, gut, mouthparts) | | | | |
| Temperature-activated transient receptor potential ion channels (b,c,d) | TRP ion channels | trpl, trp, Trpg, TrpA1, Trpm, Trpml, pain, pyx, wtrw, nompC, iav, nan, Amo | – | trpl |
| Stress sensing: Malpighian tubules (principal cells); tubule fluid secretion signaling pathways | | | | |
| cAMP signaling (e,f,g) | cAMP activation | Dh44-R2 | – | – |
| cGMP signaling (e,f,g) | cGMP activation | Capa, CapaR, Nplp1-4 | – | – |
| | Receptors | Gyc76c, CG33958, CG34357 | – | CG34357 |
| | Kinases | dg1-2 | – | – |
| cAMP + cGMP signaling (e,f,g,h) | Hydrolyzing phosphodiesterases | dnc, pde1c, pde6, pde9, pde11 | dnc, pde9 | dnc, pde1c, pde9, pde11 |
| Calcium signaling (e,g) | Calcium-sensitive nitric oxide synthase | Nos | – | Nos |
| Desiccation specific | Diuretic neuropeptide NF-kB signaling (i,j,k) | Capa, CapaR, trpl, Relish, klu | klu | klu, trpl |
| | Anti-diuretic neuropeptide (l) | ITP, sNPF, Tk | – | ITP, sNPF |
| Stress sensing: stress responsive pathway | | | | |
| Stress activated protein kinase pathway (m,n) | MAPK/Jun kinase | Aop, Aplip1, Atf-2, bsk, Btk29A, cbt, Cdc42, Cka, cno, CYLD, Dok, egr, Gadd45, hep, hppy, Jra, kay, medo, Mkk4, msn, Mtl, Pak, puc, pyd, Rac1-2, Rho1, shark, slpr, Src42A, Src64B, Tak1, Traf6 | Aop, Dok, hep, kay, msn, Mtl, puc, Rac1, Src64B | Mtl, slpr |
| Metabolic homeostasis and water balance | | | | |
| Components of insulin signaling pathway | Receptor (o,p) | InR | InR | – |
| | Receptor binding | chico, Ilp1-8, dock, Dok, poly | Dok | poly |
| Resistance mechanisms: water loss barriers | | | | |
| Cuticular hydrocarbons | Fatty acid synthases (q), desaturases (r), transporters, oxidative decarbonylase (t) | CG3523, CG3524, desat1-2, FatP, Cyp4g1 | CG3523 | – |
| Primary hemolymph sugar/tissue-protectant | | | | |
| Trehalose metabolism | Trehalose-phosphatase (b,u), trehalase activity (b,u), phosphoglycerate mutase | Tps1, Treh, Pgm, CG5171, CG5177, CG6262, crc | Treh, Pgm | – |

NOTE: Candidate genes differentiated in the current study and in Telonis-Scott et al. (2012) are shown and are underlined where overlapping.
a For brevity, several comprehensive reviews are cited and readers are referred to references therein. Note that water budgeting via discontinuous gas exchange was not included due to a lack of "bona fide" molecular candidates, and desiccation tolerance due to aquaporins was omitted as genes were not differentiated in either of our studies.
b Chown et al. (2011)
c Johnson and Carder (2012)
d Fowler and Montell (2013)
e Day et al. (2005)
f Davies et al. (2012)
g Davies et al. (2013)
h Davies et al. (2014)
i Terhzaz et al. (2012)
j Terhzaz et al. (2014)
k Terhzaz et al. (2015)
l Kahsai et al. (2010)
m Huang and Tunnacliffe (2004)
n Huang and Tunnacliffe (2005)
o Söderberg et al. (2011)
p Liu et al. (2015)
q Chung et al. (2014)
r Sinclair et al. (2007)
r Foley and Telonis-Scott (2011)
t Qiu et al. (2012)
u Thorat et al. (2012)
Conserved Genomic Signatures of Drosophila Desiccation Resistance. doi:10.1093/molbev/msv349. MBE
out of the 382 and 416 genes for the current and 2012 study, respectively (Fisher's exact test for overlap, P < 2 × 10⁻¹⁶, odds ratio = 5.90; supplementary table S4, Supplementary Material online). Bearing gene length bias in mind, we compared the average length of genes in both studies, as well as in the gene overlap, with the average gene length in the Drosophila genome (supplementary fig. S3, Supplementary Material online). Although there was bias toward longer genes in the overlap list, this was also apparent in the full gene lists from both studies.
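The overlap test reported above can be illustrated with a hypergeometric tail probability, which is the one-sided form of Fisher's exact test. The following is a sketch under stated assumptions rather than the authors' analysis: the background of 14,000 genes is a placeholder chosen for illustration (the text does not state the background used), and the function names are hypothetical.

```python
from math import comb


def overlap_p_value(n_background, n_a, n_b, n_shared):
    """P(overlap >= n_shared) when two gene lists of sizes n_a and n_b
    are drawn at random from n_background genes (one-sided test)."""
    total = comb(n_background, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_background - n_a, n_b - k)
        for k in range(n_shared, min(n_a, n_b) + 1)
    )
    return tail / total  # exact big-integer ratio converted to float


def odds_ratio(n_background, n_a, n_b, n_shared):
    """Sample odds ratio of the 2 x 2 table (in list A or not, in list B or not)."""
    only_a = n_a - n_shared
    only_b = n_b - n_shared
    neither = n_background - n_a - n_b + n_shared
    return (n_shared * neither) / (only_a * only_b)


N_BACKGROUND = 14_000  # assumption: placeholder background gene count
p_value = overlap_p_value(N_BACKGROUND, 382, 416, 45)
ratio = odds_ratio(N_BACKGROUND, 382, 416, 45)
print(f"P = {p_value:.2e}, odds ratio = {ratio:.2f}")
```

Both the P value and the odds ratio depend on the chosen background, so the values printed here are not expected to reproduce the published figures.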
The overlap genes mapped to all of the major chromosomes, and their differentiated variants were mostly noncoding SNPs located in introns (supplementary table S4, Supplementary Material online). Two SNPs were synonymous (Strn-Mlck, coding for a protein kinase, and Cht7, chitin binding; supplementary table S4, Supplementary Material online), while a 3′ UTR variant was predicted to putatively alter protein structure in a nondisruptive manner, most likely impacting expression and/or protein translational efficiency (jbug, response to stimulus; supplementary table S4, Supplementary Material online). One nonsynonymous SNP was predicted to alter protein function as a missense variant (fab1, signaling; supplementary table S4, Supplementary Material online). The increase in frequency of the favored desiccation alleles compared with the controls ranged from 5% to 27%, with a median increase of 16.4%.
Despite the strong overlap, there were important differences between the two studies. For example, our previous work revealed evidence of a selective sweep in the 5′ promoter region of the Dys gene in the artificially selected lines, and we confirmed a higher frequency of the selected SNPs in an independent natural association study in resistant flies from Coffs Harbour (Telonis-Scott et al. 2012). In this study, we found no differentiation at the Dys locus between our resistant and random controls. However, both resistant and control pools had a high frequency of the selected allele detected in Telonis-Scott et al. (2012) (supplementary table S5, Supplementary Material online).
The 45 genes common to both studies were analyzed for biological function; while many genes are obviously pleiotropic (i.e., assigned multiple GO terms), almost half were involved in response to stimuli (cellular, behavioral, and regulatory), including signaling such as Rho protein signal transduction and cAMP-mediated signaling (table 3). Other categories of interest include spiracle morphogenesis, sensory organ development (including development of the compound eye), and stem cell and neuroblast differentiation (table 3).
With regard to specific stress response/desiccation candidates (table 1), 4 of the 45 genes overlapped, including the cAMP/cGMP signaling phosphodiesterases pde9 and dnc, the MAPK pathway gene Mtl, and klu, indicated in NF-kB orthologue signaling in the renal tubules (table 1). Furthermore, there was overlap among mechanistic categories where different genes in the same pathways were detected. For example, additional key cAMP/cGMP hydrolyzing phosphodiesterases, pde1c and pde11, were significant in Telonis-Scott et al. (2012), as well as slpr (MAPK pathway) and poly (insulin signaling) (table 1).
Table 2. Pathway Enrichment of the GWAS and Telonis-Scott et al. First-Order PPI Networks Investigated Using FlyMine (FDR P < 0.05).

Database    Pathway (Current Study GWAS)    Pathway (Telonis-Scott et al. 2012)

KEGG
Ribosome    Ribosome
Notch signaling pathway    Notch signaling pathway
MAPK signaling pathway
Dorso-ventral axis formation
Progesterone-mediated oocyte maturation
Wnt signaling pathway
Endocytosis
Proteasome

Reactome
Immune system    Immune system
Adaptive immune system    Adaptive immune system
Innate immune system    Innate immune system
Signaling by interleukins    Signaling by interleukins
(Toll-like receptor cascades)    Toll-like receptor cascades
TCR signaling
(B cell receptor signaling)
Gene expression    Gene expression
Metabolism of RNA    Metabolism of RNA
Nonsense-mediated decay    Nonsense-mediated decay
Transcription
mRNA processing
(RNA transport)
Metabolism of proteins    (Metabolism of proteins)
Translation    (Translation)
Cell cycle    Cell cycle
Cell cycle checkpoints
Regulation of mitotic cell cycle
Mitotic metaphase and anaphase
G0 and early G1    G0 and early G1
G1 phase
(G1/S transition)
G2/M transition
(DNA replication/S phase)
DNA repair
Double-strand break repair
Signaling pathways    Signaling pathways
Insulin signaling pathway
Signaling by NGF    Signaling by NGF
Signaling by FGFR
Signaling by NOTCH
Signaling by Wnt
Signaling by EGFR
Development
Axon guidance
Membrane trafficking
Gap junction trafficking and regulation
Programed cell death
Apoptosis
Hemostasis    Hemostasis

NOTE: KEGG and Reactome pathway databases were queried. All significant terms are reported, with Reactome terms grouped together under parent terms in italic text (full list of terms available in supplementary table S3, Supplementary Material online). Parent terms without brackets are the names of pathways that were significantly enriched; parent terms in parentheses were not listed as significantly enriched themselves but are reported as a guide to the category contents.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. Three hundred seven GWAS seeds formed a network of 3324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3215 nodes. One thousand eight hundred twenty-six, or ~55%, of the nodes overlapped between networks, which was significantly higher than the null distribution estimated from 1000 simulations (mean simulated overlap ± SD: 1304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25704 ± 4024 edges; observed overlap: 30686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online) as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
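The resampling logic behind this comparison can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the toy interaction map, the function names, and the choice to resample seeds directly from proteins in the network (rather than from the full gene list) are all assumptions. The last two lines read the reported figures as mean ± SD and place the observed overlaps on the corresponding normal distribution.

```python
import math
import random
from statistics import mean, stdev


def first_order_network(seeds, ppi):
    """Seeds plus every direct interaction partner found in the PPI map."""
    nodes = set(seeds)
    for seed in seeds:
        nodes |= ppi.get(seed, set())
    return nodes


def simulate_node_overlap(ppi, n_seeds_a, n_seeds_b, n_sims=1000, seed=1):
    """Mean and SD of node overlap between networks built from random seed sets."""
    rng = random.Random(seed)
    proteins = sorted(ppi)
    overlaps = []
    for _ in range(n_sims):
        net_a = first_order_network(rng.sample(proteins, n_seeds_a), ppi)
        net_b = first_order_network(rng.sample(proteins, n_seeds_b), ppi)
        overlaps.append(len(net_a & net_b))
    return mean(overlaps), stdev(overlaps)


def normal_percentile(observed, mu, sd):
    """Percentile of an observed value under a normal distribution N(mu, sd)."""
    z = (observed - mu) / sd
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


def toy_ppi(n_proteins=200, n_edges=400, seed=0):
    """Hypothetical random interaction map, for demonstration only."""
    rng = random.Random(seed)
    ppi = {f"P{i}": set() for i in range(n_proteins)}
    names = list(ppi)
    for _ in range(n_edges):
        a, b = rng.sample(names, 2)
        ppi[a].add(b)
        ppi[b].add(a)
    return ppi


sim_mean, sim_sd = simulate_node_overlap(toy_ppi(), 10, 12, n_sims=200)
node_pct = normal_percentile(1826, 1304, 143)      # reported node figures
edge_pct = normal_percentile(30686, 25704, 4024)   # reported edge figures
print(f"nodes: {node_pct:.2f}th percentile, edges: {edge_pct:.1f}th percentile")
```

Under this reading, the node overlap sits above the 99.9th percentile and the edge overlap near the 89th, in line with the percentiles quoted in the text.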
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Although both networks showed enrichment in the immune system pathways, the ribosome, and Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways, the network constructed from the Telonis-Scott et al. (2012) gene list showed extensive involvement of cell cycle pathway proteins, as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we
Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.

| Term ID | Term (Subterms) | FDR P value (a) | Genes |
|---|---|---|---|
| Cellular/behavioral response to stimulus, defense, and signaling | | | |
| GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c |
| GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| Development | | | |
| GO:0048863 | Stem cell differentiation | 0.014 | Abd-D, mam, nkd, klu, nub |
| GO:0014016 | Neuroblast differentiation | 0.048 | |
| GO:0045165 | Cell fate commitment | 0.010 | Abd-D, dnc, fz, mam, nkd, ct, klu, klg, nub |
| GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-D, ct, cv-c |
| GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30c, Mtl |
| GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658 |
| GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658 |
| Cellular component organization | | | |
| GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30c, Mtl, Trim9, nub, cv-c |

NOTE: Most genes are assigned to multiple annotations and therefore are shown purely as a guide to putative function.
a Bonferroni–Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or to adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD such as D. melanogaster somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short and long-range LD and contribute numerous false positive SNP phenotype associations (Hoffmann
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53) as described in the text. The histogram shows the distribution of overlap measures for 1000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled "previous study"). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; 2B). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile; 2B). (E) Overlap at the level of GO biological process terms (directly compared GO terms by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
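The expectation discussed in this section, that standing variants at intermediate frequency respond most strongly to selection, follows from the textbook one-locus result that the per-generation change scales with p(1 − p). The sketch below uses the genic (haploid) selection model with a hypothetical selection coefficient; it illustrates the general principle and is not a model fitted to the data reported here.

```python
def delta_p(p, s):
    """One-generation change in allele frequency under genic selection.

    p is the starting frequency of the favored allele and s its
    selection coefficient: p' = p(1 + s) / (1 + s*p).
    """
    return s * p * (1 - p) / (1 + s * p)


# Hypothetical starting frequencies and selection coefficient.
start_freqs = [0.05, 0.25, 0.50, 0.75, 0.95]
shifts = {p: delta_p(p, s=0.5) for p in start_freqs}
largest = max(shifts, key=shifts.get)

for p, shift in shifts.items():
    print(f"p = {p:.2f}: shift = {shift:.3f}")
```

Rare and near-fixed alleles shift least in a single generation, which is why a one-generation design such as a Pool-GWAS is expected to detect mainly intermediate-frequency variants.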
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013) and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St. Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated. CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008); nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St. Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′ UTR and splice variants at the SNP level and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster, standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] < 10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies; <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the late mortality tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, in order to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies, n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
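The trimming rules above can be illustrated with a minimal sketch. This is not the trimmomatic implementation: the function name `trim_read`, the order of the steps, and the handling of partial windows at the read end are assumptions made for illustration, and only the numeric thresholds come from the text.

```python
def trim_read(quals, lead_trail_min=30, window=10, window_min=25, min_len=40):
    """Return the (start, end) slice of a read to keep, or None if discarded.

    quals is a list of per-base Phred quality scores (hypothetical input).
    """
    start, end = 0, len(quals)
    # Remove leading and trailing bases below the quality cutoff.
    while start < end and quals[start] < lead_trail_min:
        start += 1
    while end > start and quals[end - 1] < lead_trail_min:
        end -= 1
    # Scan windows from the 5' end and cut where the mean quality first drops.
    for i in range(start, end):
        w = quals[i:min(i + window, end)]
        if sum(w) / len(w) < window_min:
            end = i
            break
    # Discard reads that are too short after trimming.
    return (start, end) if end - start >= min_len else None
```

A read with three low-quality leading bases and seven low-quality trailing bases keeps only its high-quality core, whereas a read that falls below 40 bp after trimming is dropped.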
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), a 1% rate of missing alignments given a 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
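For readers who wish to reproduce the alignment settings, the options listed above could be assembled into a command as in the sketch below. The helper name `bwa_aln_command` and the file names are hypothetical; only the option values are taken from the text, and the command is built as a list rather than executed.

```python
def bwa_aln_command(reference, fastq):
    """Build the argument list for one bwa aln call with the reported options."""
    return [
        "bwa", "aln",
        "-l", "150",   # seed length, reported in the text as "no seeding"
        "-n", "0.01",  # missing-alignment rate given a 2% uniform base error rate
        "-o", "2",     # at most 2 gap opens per read
        "-e", "12",    # at most 12 gap extensions per read
        "-d", "12",    # disallow long deletions within 12 bp of the 3' end
        reference, fastq,
    ]

# Hypothetical file names, for illustration only.
print(" ".join(bwa_aln_command("dmel_r5.53.fa", "pool1_P1.trimmed.fq")))
```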
At this point, the bam files were merged into one file per sample using PicardMerge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
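The replicate concordance measure can be sketched as follows. This is a plain implementation of Lin's concordance correlation coefficient using population (1/n) moments, not the epiR code; the function names and the toy allele frequencies are assumptions for illustration.

```python
from itertools import combinations
from statistics import mean

def ccc(x, y):
    """Lin's concordance correlation coefficient for two equal-length vectors."""
    mx, my = mean(x), mean(y)
    n = len(x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx2 = sum((a - mx) ** 2 for a in x) / n
    sy2 = sum((b - my) ** 2 for b in y) / n
    # Agreement is penalized both by scatter and by a shift in the means.
    return 2 * sxy / (sx2 + sy2 + (mx - my) ** 2)

def mean_pairwise_ccc(replicates):
    """Average CCC over all pairs of replicates within one pool category."""
    return mean(ccc(x, y) for x, y in combinations(replicates, 2))

# Toy allele frequency estimates at four SNPs in two replicates.
rep1 = [0.1, 0.2, 0.3, 0.4]
rep2 = [0.2, 0.3, 0.4, 0.5]
print(ccc(rep1, rep2))
```

Unlike a Pearson correlation, which would be 1 for the toy data, the coefficient drops below 1 when one replicate is systematically shifted relative to the other.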
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow, -norna, -no_is). The rmblast search engine was used.
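As an illustration of how allele frequencies can be read from a sync file, the sketch below parses one line, assuming the usual tab-separated layout of chromosome, position, reference base, and one A:T:C:G:N:del count column per pool. The function names and the example line are hypothetical.

```python
def parse_sync_line(line):
    """Split one sync line into chromosome, position, reference base, and pool counts."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    pools = []
    for col in fields[3:]:
        # Keep the four nucleotide counts; N and deletion counts are ignored here.
        a, t, c, g = (int(v) for v in col.split(":")[:4])
        pools.append({"A": a, "T": t, "C": c, "G": g})
    return chrom, pos, ref, pools

def allele_frequency(counts, allele):
    """Frequency of one allele among the nucleotide reads in a pool."""
    depth = sum(counts.values())
    return counts[allele] / depth if depth else float("nan")

# Hypothetical line with a control pool followed by a resistant pool.
chrom, pos, ref, pools = parse_sync_line("2L\t5000\tA\t40:10:0:0:0:0\t25:25:0:0:0:0")
print(allele_frequency(pools[0], "T"), allele_frequency(pools[1], "T"))
```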
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
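The per-SNP test and its filters can be sketched in a few lines. This is a generic two-sided Fisher's exact test, not the Popoolation2 script; the function names are hypothetical, and treating the minimum count as the minor-allele count summed over both pools is an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided P value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum all tables that are no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def passes_filters(ctrl, res, min_count=10, min_cov=20, max_cov=10000):
    """ctrl and res are (major, minor) read counts in the two pools."""
    minor_total = ctrl[1] + res[1]
    return minor_total >= min_count and all(
        min_cov <= sum(pool) <= max_cov for pool in (ctrl, res)
    )

ctrl, res = (45, 5), (30, 20)  # hypothetical read counts
if passes_filters(ctrl, res):
    print(fisher_exact_two_sided(ctrl[0], ctrl[1], res[0], res[1]))
```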
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean α/(α + β) and variance αβ/((α + β)²(α + β + 1)). This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of α = 37 was chosen by chi-square test as the best match to the observed P value distribution (α = 10 to α = 40 were tested), with a corresponding β = 43.15 to reflect the coverage depth difference between the random control (487×) and desiccation-resistant (568×) pools. These values were then used to simulate a null data set 10× the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
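Two ingredients of this procedure, drawing from a beta-binomial distribution and reading a threshold off the lower tail of the null P values, can be sketched as follows. The function names are hypothetical, and the sketch does not reproduce the full simulation of Bastide et al. (2013); it only shows that α = 37 and β = 43.15 give an expected proportion close to 487/(487 + 568), and how a lowest-0.05% cutoff is obtained.

```python
import random

def beta_binomial_draw(n, alpha, beta, rng):
    """Draw a count out of n with a success probability sampled from Beta(alpha, beta)."""
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(n))

def empirical_threshold(null_pvalues, fraction=0.0005):
    """Largest P value within the lowest `fraction` of the null distribution."""
    ordered = sorted(null_pvalues)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1]

rng = random.Random(1)
draws = [beta_binomial_draw(100, 37, 43.15, rng) for _ in range(2000)]
print(sum(draws) / (100 * len(draws)), 487 / (487 + 568))
```

With 10,000 evenly spaced null P values, for example, the cutoff is the fifth smallest value.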
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
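One way to set up such a test for a single feature type is sketched below. It treats the proportion among all SNPs as the expectation for the candidate SNPs (a goodness-of-fit test with 1 degree of freedom); this framing, the function name, and the counts are assumptions for illustration rather than the exact test configuration used in the study.

```python
from math import erfc, sqrt

def feature_enrichment(cand_in, cand_total, all_in, all_total):
    """Chi-square test of candidate SNPs in a feature against the genome-wide proportion."""
    expected_in = cand_total * all_in / all_total
    expected_out = cand_total - expected_in
    observed_out = cand_total - cand_in
    chi2 = ((cand_in - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    p = erfc(sqrt(chi2 / 2))  # upper tail of the chi-square distribution, 1 d.f.
    direction = "over" if cand_in > expected_in else "under"
    return chi2, p, direction

# Hypothetical counts: 60 of 100 candidates vs. 500 of 1,000 SNPs in a feature type.
print(feature_enrichment(60, 100, 500, 1000))
```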
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10⁶ simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
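The logic of a SNP-mode permutation test can be sketched as follows. This is a simplified illustration of the general idea (resampling SNPs so that long genes are hit more often by chance), not Gowinda's algorithm; the function name, data structures, and toy data are all hypothetical.

```python
import random

def go_permutation_pvalues(candidate_snps, all_snps, snp_to_genes, gene_to_go,
                           n_sim=1000, seed=0):
    """Empirical P value per GO term from random SNP sets of the candidate size."""
    rng = random.Random(seed)

    def category_counts(snps):
        # Map SNPs to the set of genes they hit, then count genes per GO term.
        genes = set()
        for s in snps:
            genes.update(snp_to_genes.get(s, ()))
        counts = {}
        for g in genes:
            for term in gene_to_go.get(g, ()):
                counts[term] = counts.get(term, 0) + 1
        return counts

    observed = category_counts(candidate_snps)
    hits = dict.fromkeys(observed, 0)
    for _ in range(n_sim):
        # all_snps must be a sequence (e.g., a list) for random sampling.
        sim = category_counts(rng.sample(all_snps, len(candidate_snps)))
        for term, obs in observed.items():
            if sim.get(term, 0) >= obs:
                hits[term] += 1
    return {term: hits[term] / n_sim for term in observed}

# Toy data: 100 SNPs in 100 genes; only the first five genes carry the "stress" term.
snps = list(range(100))
snp_to_genes = {i: [f"g{i}"] for i in snps}
gene_to_go = {f"g{i}": (["stress", "any"] if i < 5 else ["any"]) for i in snps}
print(go_permutation_pvalues(snps[:5], snps, snp_to_genes, gene_to_go, n_sim=200))
```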
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
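The first-order network construction can be illustrated without igraph using plain Python sets. The sketch assumes that the subgraph is induced on the seed proteins plus their direct interactors (so that edges between two interactors are kept); the function name and the toy edge list are hypothetical.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) of the subgraph induced by seeds and their direct interactors."""
    # Undirected edges as frozensets; self-connections are dropped.
    edges = {frozenset(e) for e in edges if e[0] != e[1]}
    present = {n for e in edges for n in e}
    seeds = set(seeds) & present  # seeds absent from the PPI data are ignored
    nodes = set(seeds)
    for e in edges:
        if e & seeds:
            nodes |= e
    sub_edges = {e for e in edges if e <= nodes}
    return nodes, sub_edges

# Toy interaction list; "Z" is a candidate with no PPI data.
ppi = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("A", "A"), ("E", "F")]
nodes, sub_edges = first_order_network(ppi, ["A", "Z"])
print(sorted(nodes), len(sub_edges))
```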
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study and the alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study. It took the form of a 2 × 2 table classifying each gene as present or absent in the GWAS list and as present or absent in the Telonis-Scott et al. (2012) list, where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; and N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
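The gene list overlap test can be sketched from the numbers given here and the core set of 45 overlapping genes reported in the Abstract. The sketch computes a one-sided hypergeometric tail probability, which is a simplification of the Fisher's exact test described above; the function names are hypothetical.

```python
from math import comb

def overlap_table(n_a, n_b, n_both, n_total):
    """2x2 table: rows = not in / in list A, columns = not in / in list B."""
    return [[n_total - n_a - n_b + n_both, n_b - n_both],
            [n_a - n_both, n_both]]

def enrichment_pvalue(n_a, n_b, n_both, n_total):
    """P(overlap >= n_both) when n_b genes are drawn at random from n_total."""
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, upper + 1))
    return tail / comb(n_total, n_b)

# A = 416 genes (Telonis-Scott et al. 2012), B = 382 genes (this study),
# 45 shared genes, N = 17,106 annotated genes.
print(overlap_table(416, 382, 45, 17106))
print(enrichment_pvalue(416, 382, 45, 17106))
```

Under random sampling the expected overlap is about 416 × 382 / 17,106 ≈ 9 genes, so an overlap of 45 genes lies far in the upper tail.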
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolve alleles than array-based genotyping. Because of this limitation of the array data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction, P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
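The resampling scheme can be sketched with set operations standing in for the igraph calls. The function names and toy data are hypothetical, the first-order network is assumed to be the subgraph induced on seeds and their direct interactors, and the edge intersection is a simplified stand-in for graph.intersection.

```python
import random

def network_edges(edges, seeds):
    """Edges of the first-order network built around a set of seed genes."""
    edges = {frozenset(e) for e in edges if e[0] != e[1]}
    seeds = set(seeds)
    nodes = set(seeds)
    for e in edges:
        if e & seeds:
            nodes |= e
    return {e for e in edges if e <= nodes}

def overlap_size(edges, seeds_1, seeds_2):
    """(node count, edge count) of the overlap between two first-order networks."""
    shared = network_edges(edges, seeds_1) & network_edges(edges, seeds_2)
    nodes = {n for e in shared for n in e}  # zero-degree nodes drop out
    return len(nodes), len(shared)

def simulate_null(edges, gene_pool, size_1, size_2, n_iter=1000, seed=0):
    """Overlap sizes for random gene sets matching the two list lengths."""
    rng = random.Random(seed)
    return [overlap_size(edges, rng.sample(gene_pool, size_1),
                         rng.sample(gene_pool, size_2))
            for _ in range(n_iter)]

def empirical_p(observed, null_values):
    """Fraction of simulated values at least as large as the observed value."""
    return sum(v >= observed for v in null_values) / len(null_values)

ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]  # toy interactions
observed_nodes, observed_edges = overlap_size(ppi, ["B"], ["C"])
null = simulate_null(ppi, list("ABCDE"), 1, 1, n_iter=50)
print(empirical_p(observed_edges, [n_edges for _, n_edges in null]))
```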
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
Contingency table for the gene list overlap test:
                                      Not in GWAS              In GWAS
Not in Telonis-Scott et al. (2012)    N – intersect – A – B    A – intersect
In Telonis-Scott et al. (2012)        B – intersect            Intersect(A, B)
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool. 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra – flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 – advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L Jia P Wolfinger RD Chen X Zhao Z 2011 Gene set analysis ofgenome-wide association studies methodological issues and per-spectives Genomics 981ndash8
Weber AL Khan GF Magwire MM Tabor CL Mackay TF Anholt RR2012 Genome-wide association analysis of oxidative stress resistancein Drosophila melanogaster PLoS One 7e34745
Weeks AR McKechnie SW Hoffmann AA 2002 Dissecting adaptiveclinal variation markers inversions and sizestress associations inDrosophila melanogaster from a central field population Ecol Lett5756ndash763
Zhu Y Bergland AO Gonzalez J Petrov DA 2012 Empirical validation ofpooled whole genome population re-sequencing in Drosophila mel-anogaster PLoS One 7e41901
Table 1. Summary of Primary Molecular Responses and Candidate Genes for Desiccation Response Mechanisms Identified in the Literature.

Mechanism | Molecular Function | Genes | Current Study (F1 wild-caught) | Telonis-Scott et al. (2012) (F8 artificial selection)
Hygrosensing: sensory moisture receptors (antennae, legs, gut, mouthparts)
Temperature-activated transient receptor potential ion channels [b,c,d] | TRP ion channels | trpl, trp, Trpγ, TrpA1, Trpm, Trpml, pain, pyx, wtrw, nompC, iav, nan, Amo | - | trpl
Stress sensing: Malpighian tubules (principal cells); tubule fluid secretion signaling pathways
cAMP signaling [e,f,g] | cAMP activation | Dh44-R2 | - | -
cGMP signaling [e,f,g] | cGMP activation | Capa, CapaR, Nplp1-4 | - | -
 | Receptors | Gyc76c, CG33958, CG34357 | - | CG34357
 | Kinases | dg1-2 | - | -
cAMP + cGMP signaling [e,f,g,h] | Hydrolyzing phosphodiesterases | dnc, pde1c, pde6, pde9, pde11 | dnc, pde9 | dnc, pde1c, pde9, pde11
Calcium signaling [e,g] | Calcium-sensitive nitric oxide synthase | Nos | - | Nos
Desiccation specific | Diuretic neuropeptide NF-kB signaling [i,j,k] | Capa, CapaR, trpl, Relish, klu | klu | klu, trpl
Anti-diuretic neuropeptide [l] | | ITP, sNPF, Tk | - | ITP, sNPF
Stress sensing: stress responsive pathway
Stress activated protein kinase pathway [m,n] | MAPK Jun kinase | Aop, Aplip1, Atf-2, bsk, Btk29A, cbt, Cdc42, Cka, cno, CYLD, Dok, egr, Gadd45, hep, hppy, Jra, kay, medo, Mkk4, msn, Mtl, Pak, puc, pyd, Rac1-2, Rho1, shark, slpr, Src42A, Src64B, Tak1, Traf6 | Aop, Dok, hep, kay, msn, Mtl, puc, Rac1, Src64B | Mtl, slpr
Metabolic homeostasis and water balance
Components of insulin signaling pathway | Receptor [o,p] | InR | InR | -
 | Receptor binding | chico, Ilp1-8, dock, Dok, poly | Dok | poly
Resistance mechanisms: water loss barriers
Cuticular hydrocarbons | Fatty acid synthases [q], desaturases [r], transporters, oxidative decarbonylase [t] | CG3523, CG3524, desat1-2, FatP, Cyp4g1 | CG3523 | -
Primary hemolymph sugar/tissue-protectant
Trehalose metabolism | Trehalose-phosphatase [b,u], trehalase activity [b,u], phosphoglycerate mutase | Tps1, Treh, Pgm, CG5171, CG5177, CG6262, crc | Treh, Pgm | -

NOTE: Candidate genes differentiated in the current study and in Telonis-Scott et al. (2012) are shown and are underlined where overlapping.
a. For brevity, several comprehensive reviews are cited and readers are referred to references therein. Note that water budgeting via discontinuous gas exchange was not included due to a lack of "bona fide" molecular candidates, and desiccation tolerance due to aquaporins was omitted as genes were not differentiated in either of our studies.
b. Chown et al. (2011)
c. Johnson and Carder (2012)
d. Fowler and Montell (2013)
e. Day et al. (2005)
f. Davies et al. (2012)
g. Davies et al. (2013)
h. Davies et al. (2014)
i. Terhzaz et al. (2012)
j. Terhzaz et al. (2014)
k. Terhzaz et al. (2015)
l. Kahsai et al. (2010)
m. Huang and Tunnacliffe (2004)
n. Huang and Tunnacliffe (2005)
o. Söderberg et al. (2011)
p. Liu et al. (2015)
q. Chung et al. (2014)
r. Sinclair et al. (2007)
s. Foley and Telonis-Scott (2011)
t. Qiu et al. (2012)
u. Thorat et al. (2012)
Conserved Genomic Signatures of Drosophila Desiccation Resistance · doi:10.1093/molbev/msv349 · MBE
out of the 382 and 416 genes for the current and 2012 study, respectively (Fisher's exact test for overlap, P < 2 × 10^-16, odds ratio = 5.90; supplementary table S4, Supplementary Material online). Bearing gene length bias in mind, we compared the average length of genes in both studies, as well as in the gene overlap, with the average gene length in the Drosophila genome (supplementary fig. S3, Supplementary Material online). Although there was bias toward longer genes in the overlap list, this was also apparent in the full gene lists from both studies.
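As a rough illustration of the overlap test reported above, the following sketch computes a sample odds ratio and a one-sided hypergeometric (Fisher) P value for 45 shared genes between lists of 382 and 416 genes. The background gene count (`n_background = 17000`) is an assumption made purely for illustration, since the gene universe used in the published test is not stated in this section, and the sample odds ratio computed here is not the conditional estimate that some statistics packages report.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_test(n_a, n_b, n_overlap, n_background):
    """Sample odds ratio and one-sided P value for the overlap of two gene lists.

    n_background is the size of the gene universe (an assumed value here).
    """
    # 2 x 2 contingency table of gene membership in list A and list B
    a = n_overlap                              # in both lists
    b = n_a - n_overlap                        # in A only
    c = n_b - n_overlap                        # in B only
    d = n_background - n_a - n_b + n_overlap   # in neither list
    odds_ratio = (a * d) / (b * c)

    # P(overlap >= observed) under the hypergeometric null distribution
    log_total = log_choose(n_background, n_b)
    p_value = 0.0
    for k in range(n_overlap, min(n_a, n_b) + 1):
        p_value += exp(
            log_choose(n_a, k)
            + log_choose(n_background - n_a, n_b - k)
            - log_total
        )
    return odds_ratio, p_value


# Gene counts from the text; the background size is hypothetical.
odds_ratio, p_value = overlap_test(382, 416, 45, n_background=17000)
print(f"odds ratio = {odds_ratio:.2f}, one-sided P = {p_value:.2e}")
```

The result depends on the assumed background: a smaller gene universe lowers the odds ratio, so the sketch should be read as a sanity check on the order of magnitude rather than a reproduction of the published value.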
The overlap genes mapped to all of the major chromosomes, and their differentiated variants were mostly noncoding SNPs located in introns (supplementary table S4, Supplementary Material online). Two SNPs were synonymous (Strn-Mlck, coding for a protein kinase, and Cht7, chitin binding; supplementary table S4, Supplementary Material online), while a 3′UTR variant was predicted to putatively alter protein structure in a nondisruptive manner, most likely impacting expression and/or protein translational efficiency (jbug, response to stimulus; supplementary table S4, Supplementary Material online). One nonsynonymous SNP was predicted to alter protein function as a missense variant (fab1, signaling; supplementary table S4, Supplementary Material online). The increase in frequency of the favored desiccation alleles compared with the controls ranged from 5% to 27%, with a median increase of 16.4%.
Despite the strong overlap, there were important differences between the two studies. For example, our previous work revealed evidence of a selective sweep in the 5′ promoter region of the Dys gene in the artificially selected lines, and we confirmed a higher frequency of the selected SNPs in an independent natural association study in resistant flies from Coffs Harbour (Telonis-Scott et al. 2012). In this study, we found no differentiation at the Dys locus between our resistant and random controls. However, both resistant and control pools had a high frequency of the selected allele detected in Telonis-Scott et al. (2012) (supplementary table S5, Supplementary Material online).
The 45 genes common to both studies were analyzed for biological function; while many genes are obviously pleiotropic (i.e., assigned multiple GO terms), almost half were involved in response to stimuli (cellular, behavioral, and regulatory), including signaling such as Rho protein signal transduction and cAMP-mediated signaling (table 3). Other categories of interest include spiracle morphogenesis, sensory organ development including development of the compound eye, and stem cell and neuroblast differentiation (table 3).
With regard to specific stress response/desiccation candidates (table 1), 4 of the 45 genes overlapped: the cAMP/cGMP signaling phosphodiesterases pde9 and dnc, the MAPK pathway gene Mtl, and klu, which is implicated in NF-kB orthologue signaling in the renal tubules (table 1). Furthermore, there was overlap among mechanistic categories, where different genes in the same pathways were detected. For example, the additional key cAMP/cGMP hydrolyzing phosphodiesterases pde1c and pde11 were significant in Telonis-Scott et al. (2012), as were slpr (MAPK pathway) and poly (insulin signaling) (table 1).
Table 2. Pathway Enrichment of the GWAS and Telonis-Scott et al. (2012) First-Order PPI Networks Investigated Using FlyMine (FDR P < 0.05).
Database | Pathway, Current Study GWAS | Pathway, Telonis-Scott et al. (2012)
KEGG Ribosome RibosomeNotch signaling pathway Notch signaling pathwayMAPK signaling pathwayDorso-ventral axis formationProgesterone-mediated oocyte
maturationWnt signaling pathwayEndocytosisProteasome
Reactome Immune system Immune systemAdaptive immune system Adaptive immune systemInnate immune system Innate immune systemSignaling by interleukins Signaling by interleukins(Toll-like receptor cascades) Toll-like receptor
cascadesTCR signaling(B cell receptor signaling)
Gene expression Gene expressionMetabolism of RNA Metabolism of RNANonsense-mediated decay Nonsense-mediated decayTranscriptionmRNA processing(RNA transport)Metabolism of proteins (Metabolism of
proteins)Translation (Translation)Cell cycle Cell cycle
Cell cycle checkpointsRegulation of mitotic cell
cycleMitotic metaphase and
anaphaseG0 and early G1 G0 and early G1
G1 phase(G1S transition)G2M transition(DNA replicationS
phase)DNA repairDouble-strand break
repairSignaling pathways Signaling pathwaysInsulin signaling pathwaySignaling by NGF Signaling by NGF
Signaling by FGFRSignaling by NOTCHSignaling by WntSignaling by EGFRDevelopmentAxon guidanceMembrane traffickingGap junction trafficking
and regulationProgramed cell deathApoptosisHemostasisHemostasis
NOTE: KEGG and Reactome pathway databases were queried. All significant terms are reported, with Reactome terms grouped together under parent terms in italic text (full list of terms available in supplementary table S3, Supplementary Material online). Parent terms without brackets are the names of pathways that were significantly enriched; parent terms in parentheses were not listed as significantly enriched themselves but are reported as a guide to the category contents.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. Three hundred seven GWAS seeds formed a network of 3324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3215 nodes. One thousand eight hundred twenty-six, or ~55%, of the nodes overlapped between networks, which was significantly higher than the null distribution estimated from 1000 simulations (mean simulated overlap ± SD: 1304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25704 ± 4024 edges; observed overlap 30686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online) as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
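The percentiles quoted above can be checked with a short calculation. The sketch below assumes, following the Fig. 2 caption, that the observed overlap is located on a normal distribution parameterized by the simulated mean and SD (read here as 1304 and 143 nodes, and 25704 and 4024 edges); the function name is illustrative and is not taken from the authors' pipeline.

```python
from statistics import NormalDist


def overlap_percentile(observed, sim_mean, sim_sd):
    """Return the z-score and normal-distribution percentile of an observed overlap."""
    z = (observed - sim_mean) / sim_sd
    return z, 100 * NormalDist().cdf(z)


# Values as reported in the text for the node and edge comparisons.
node_z, node_pct = overlap_percentile(1826, 1304, 143)
edge_z, edge_pct = overlap_percentile(30686, 25704, 4024)

# Share of the GWAS network's 3324 nodes that also occur in the 2012 network.
node_share = 100 * 1826 / 3324

print(f"nodes: z = {node_z:.2f}, percentile = {node_pct:.2f}")
print(f"edges: z = {edge_z:.2f}, percentile = {edge_pct:.2f}")
print(f"shared nodes: {node_share:.1f}% of the GWAS network")
```

Under these assumptions the node overlap falls above the 99.9th percentile and the edge overlap near the 89th, which agrees with the reported figures; an empirical percentile taken directly from the 1000 simulated overlaps could differ slightly from this normal approximation.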
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Although both networks showed enrichment in the immune system pathways, the ribosome, and Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways, the network constructed from the Telonis-Scott et al. (2012) gene list showed extensive involvement of cell cycle pathway proteins, as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we
Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.

Term ID | Term (Subterms) | FDR P value [a] | Genes
Cellular/behavioral response to stimulus, defense, and signaling
GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658
GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c
GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658
Development
GO:0048863 | Stem cell differentiation | 0.014 | Abd-B, mam, nkd, klu, nub
GO:0014016 | Neuroblast differentiation | 0.048 |
GO:0045165 | Cell fate commitment | 0.010 | Abd-B, dnc, fz, mam, nkd, ct, klu, klg, nub
GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-B, ct, cv-c
GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30c, Mtl
GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658
GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658
Cellular component organization
GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30c, Mtl, Trim9, nub, cv-c

NOTE: Most genes are assigned to multiple annotations and therefore are shown purely as a guide to putative function.
a. Bonferroni–Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or to adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD, such as D. melanogaster, somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short and long-range LD and contribute numerous false positive SNP phenotype associations (Hoffmann
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53) as described in the text. The histogram shows the distribution of overlap measures for 1000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled "previous study"). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; panel A). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile; panel B). (E) Overlap at the level of GO biological process terms (GO terms were compared directly by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013), and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research, and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008); nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′UTR and splice variants at the SNP level, and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster, standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus, but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies, <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the late mortality tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, in order to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies, n = 1000). The DNA was fragmented using a Covaris S2 machine (Covaris Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
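To make the trimming thresholds concrete, the sketch below applies the same three rules (leading/trailing quality 30, a sliding window of width 10 with mean quality 25, minimum length 40) to a list of per-base quality scores. It is a simplified stand-in written for illustration, not trimmomatic's implementation: the order of the steps and the choice to cut at the start of the first failing window are assumptions.

```python
def trim_read(quals, lead_trail_q=30, window=10, window_q=25, min_len=40):
    """Return the (start, end) slice of a read to keep, or None if it is too short.

    quals is a list of per-base Phred quality scores for one read.
    """
    start, end = 0, len(quals)

    # Remove leading and trailing bases below the quality cutoff.
    while start < end and quals[start] < lead_trail_q:
        start += 1
    while end > start and quals[end - 1] < lead_trail_q:
        end -= 1

    # Scan 5' to 3' and cut at the first window whose mean quality is too low.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break

    # Discard reads that end up shorter than the minimum length.
    if end - start < min_len:
        return None
    return start, end


# Hypothetical read: 50 good bases, a 15-base low-quality stretch, 5 good bases.
example_quals = [35] * 50 + [20] * 15 + [35] * 5
kept = trim_read(example_quals)
print(kept)
```

In this example the read is cut where the window mean first drops below 25, so the low-quality stretch and everything after it are removed while the read remains long enough to keep.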
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), 1% rate of missing alignments given 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
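For readers who want to see the alignment options assembled in one place, the following sketch builds a `bwa aln` argument list with the flags listed above. The reference prefix and file names are hypothetical placeholders, the `-f` output flag is added here for convenience and is not mentioned in the text, and the command is only constructed and printed, not executed.

```python
import shlex


def bwa_aln_command(reference_prefix, fastq_path, sai_path):
    """Build (but do not run) a bwa aln command using the options from the text."""
    return [
        "bwa", "aln",
        "-l", "150",    # seed length longer than the reads, i.e. no seeding
        "-n", "0.01",   # missing-alignment rate used to set the mismatch limit
        "-o", "2",      # maximum number of gap opens
        "-e", "12",     # maximum number of gap extensions
        "-d", "12",     # disallow long deletions within 12 bp of the 3' end
        "-f", sai_path, # write the .sai output to this file (assumed addition)
        reference_prefix,
        fastq_path,
    ]


# Hypothetical file names, for illustration only.
cmd = bwa_aln_command("dmel_r5.53", "pool1_lane1_P1.fastq", "pool1_lane1_P1.sai")
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to hand to a pipeline runner or to `subprocess.run` once real paths are supplied.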
At this point, the bam files were merged into one file per sample using PicardMerge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency < 0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
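The concordance measure used for the replicate comparison can be sketched as follows. This is a plain implementation of Lin's concordance correlation coefficient using population (1/n) moments, which is an assumption; it is not the epiR code and may differ from it in small-sample details. Function names are hypothetical.

```python
from itertools import combinations
from statistics import fmean
from typing import List, Sequence


def concordance_ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient for two paired vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalises both scatter and a shift between the two replicates.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def mean_pairwise_ccc(replicates: List[Sequence[float]]) -> float:
    """Average the coefficient over all pairs of replicates in one pool category."""
    return fmean(concordance_ccc(a, b) for a, b in combinations(replicates, 2))
```

Each inner sequence holds the allele frequency estimates of one technical replicate across the same set of SNPs.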
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v 0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in Popoolation2 v 1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow, -norna, -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Poolseq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
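The per-SNP test can be sketched as a count filter followed by a two-sided Fisher's exact test on a 2 × 2 table of major and minor allele counts in the two pools. This is an illustration, not the Popoolation2 script: the function names are hypothetical, and applying the minimum count to the minor allele summed over both pools is an assumption.

```python
from math import comb


def passes_filters(major1: int, minor1: int, major2: int, minor2: int,
                   min_count: int = 10, min_cov: int = 20,
                   max_cov: int = 10000) -> bool:
    """Apply the count and coverage thresholds quoted in the text."""
    cov1, cov2 = major1 + minor1, major2 + minor2
    if minor1 + minor2 < min_count:  # assumption: minor count pooled over samples
        return False
    return all(min_cov <= c <= max_cov for c in (cov1, cov2))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)
```

Here the rows would be the control and desiccation-resistant pools and the columns the major and minor allele read counts at one SNP.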
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean $a/(a+b)$ and variance $ab/((a+b)^2(a+b+1))$. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of a = 37 was chosen by chi-square test as the best match to the observed P value distribution (a = 10 to a = 40 were tested), with a corresponding b = 43.15 to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10× the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
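The quantities in this paragraph can be checked with a short sketch. It rests on several assumptions: that b is obtained by scaling a by the ratio of the two pool coverages (568/487, which reproduces the reported 43.15), that the threshold is the 0.05% quantile of the null P values, and that a beta-binomial draw is a binomial draw whose success probability comes from a beta distribution. It does not reproduce the procedure of Bastide et al. (2013); all function names are hypothetical.

```python
import random
from typing import Sequence, Tuple


def beta_mean_var(a: float, b: float) -> Tuple[float, float]:
    """Mean a/(a+b) and variance ab/((a+b)^2 (a+b+1)) of the beta component."""
    total = a + b
    return a / total, (a * b) / (total ** 2 * (total + 1))


def matched_b(a: float, cov_control: float, cov_resistant: float) -> float:
    """Assumed relation: scale a by the coverage ratio of the two pools."""
    return a * cov_resistant / cov_control


def simulate_minor_count(a: float, b: float, depth: int,
                         rng: random.Random) -> int:
    """One beta-binomial draw: p ~ Beta(a, b), then count ~ Binomial(depth, p)."""
    p = rng.betavariate(a, b)
    return sum(rng.random() < p for _ in range(depth))


def empirical_threshold(null_pvalues: Sequence[float],
                        fraction: float = 0.0005) -> float:
    """P value below which the lowest `fraction` of null P values fall."""
    ordered = sorted(null_pvalues)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1]
```

With a = 37 and coverages of 487 and 568, `matched_b` returns approximately 43.15, in agreement with the value given in the text.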
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
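A feature-enrichment comparison of this kind can be sketched as a 2 × 2 chi-square test with one degree of freedom. The table layout (candidate versus non-candidate SNPs, inside versus outside a feature type) and the function name are assumptions; no continuity correction is applied.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_square_feature(cand_in: int, cand_total: int,
                       all_in: int, all_total: int) -> Tuple[float, float]:
    """Chi-square statistic and P value (df = 1) for one feature type.

    cand_in / cand_total: candidate SNPs in the feature / all candidate SNPs.
    all_in / all_total:   all SNPs in the feature / all SNPs in the study.
    """
    a = cand_in                      # candidate, in feature
    b = cand_total - cand_in         # candidate, not in feature
    c = all_in - cand_in             # non-candidate, in feature
    d = (all_total - all_in) - b     # non-candidate, not in feature
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

A feature type is over-represented among candidates when the candidate proportion exceeds the background proportion and the P value is small.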
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10^6 simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v 2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction P < 0.05, background = all D. melanogaster genes in our full PPI network, and Ontology (for GO enrichment only) = biological process or molecular function.
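The first-order network construction can be sketched without igraph, using plain sets. Treating the result as the subgraph induced on the candidate proteins and their direct interactors (so that edges between two interactors are kept) is an assumption about the procedure, and the function name is hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def first_order_network(interactions: Iterable[Tuple[str, str]],
                        seeds: Iterable[str]) -> Tuple[Set[str], Set[Edge]]:
    """Return (nodes, edges) of the first-order network around the seeds."""
    # Undirected, de-duplicated edges with self-connections removed.
    pairs = {frozenset(p) for p in interactions if p[0] != p[1]}
    seed_set = set(seeds)
    nodes: Set[str] = set()
    for edge in pairs:
        if edge & seed_set:          # edge touches at least one seed protein
            nodes |= edge
    # Induced subgraph: keep every edge whose two ends are both retained.
    edges = {edge for edge in pairs if edge <= nodes}
    return nodes, edges
```

Seed proteins with no recorded interaction do not appear in the returned node set.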
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study with alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study and took the following form:
where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in D. melanogaster reference genome r5.53, n = 17,106.
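The overlap test can be worked through numerically with the list sizes given above and the 45 shared genes reported elsewhere in the article. The sketch below fills the four cells of the contingency table, computes the sample odds ratio, and evaluates a one-sided hypergeometric tail probability. Function names are hypothetical, and the one-sided tail is an assumption; statistical packages may report a conditional estimate of the odds ratio that differs slightly from the sample value.

```python
from math import comb
from typing import Tuple


def overlap_table(n_total: int, n_a: int, n_b: int,
                  n_both: int) -> Tuple[int, int, int, int]:
    """Cells (neither, A only, B only, both) of the 2 x 2 overlap table."""
    a_only = n_a - n_both
    b_only = n_b - n_both
    neither = n_total - a_only - b_only - n_both
    return neither, a_only, b_only, n_both


def sample_odds_ratio(neither: int, a_only: int, b_only: int, both: int) -> float:
    return (both * neither) / (a_only * b_only)


def overlap_tail_p(n_total: int, n_a: int, n_b: int, n_both: int) -> float:
    """P(overlap >= n_both) when n_b genes are drawn at random from n_total."""
    denom = comb(n_total, n_b)
    return sum(comb(n_a, x) * comb(n_total - n_a, n_b - x) / denom
               for x in range(n_both, min(n_a, n_b) + 1))


if __name__ == "__main__":
    cells = overlap_table(17106, 416, 382, 45)
    print(cells, sample_odds_ratio(*cells), overlap_tail_p(17106, 416, 382, 45))
```

With these inputs the table cells are 16,353 (neither), 371 (A only), 337 (B only), and 45 (both), and the sample odds ratio is close to 5.9.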
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Due to this limitation, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
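The resampling scheme can be sketched with plain sets in place of igraph. Edges touching a resampled gene define each first-order network here, the overlap network is the set of shared edges, and its nodes are the proteins on those edges (equivalent to dropping zero-degree nodes). The add-one form of the empirical P value is an assumption rather than the authors' stated method, and all names are hypothetical.

```python
import random
from typing import FrozenSet, List, Sequence, Set, Tuple

Edge = FrozenSet[str]


def first_order_edges(pairs: Set[Edge], seeds: Sequence[str]) -> Set[Edge]:
    """Edges of the full network that touch at least one seed gene."""
    seed_set = set(seeds)
    return {edge for edge in pairs if edge & seed_set}


def overlap_counts(edges1: Set[Edge], edges2: Set[Edge]) -> Tuple[int, int]:
    """(number of nodes, number of edges) in the intersection network."""
    shared = edges1 & edges2
    nodes = set().union(*shared) if shared else set()
    return len(nodes), len(shared)


def simulate_null(pairs: Set[Edge], universe: List[str], size_a: int,
                  size_b: int, iterations: int = 1000,
                  seed: int = 0) -> List[Tuple[int, int]]:
    """Null distribution of overlap counts from randomly resampled gene sets."""
    rng = random.Random(seed)
    null = []
    for _ in range(iterations):
        net_a = first_order_edges(pairs, rng.sample(universe, size_a))
        net_b = first_order_edges(pairs, rng.sample(universe, size_b))
        null.append(overlap_counts(net_a, net_b))
    return null


def empirical_p(observed: int, null_values: Sequence[int]) -> float:
    """Share of null values at least as large as observed (add-one corrected)."""
    return (1 + sum(v >= observed for v in null_values)) / (1 + len(null_values))
```

The node and edge counts from `simulate_null` would be compared separately with the observed node and edge overlap.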
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship DE120102575 and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol 57:1070–1084.
                                      Not in GWAS              In GWAS
Not in Telonis-Scott et al. (2012)    N – intersect – A – B    A – intersect
In Telonis-Scott et al. (2012)        B – intersect            Intersect(A, B)
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol 21:4931–4941.
Picard. Picard: A set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: A powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-κB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
out of the 382 and 416 genes for the current and 2012 study, respectively (Fisher's exact test for overlap, P < 2 × 10^-16, odds ratio = 5.90; supplementary table S4, Supplementary Material online). Bearing gene length bias in mind, we compared the average length of genes in both studies, as well as in the gene overlap, with the average gene length in the Drosophila genome (supplementary fig. S3, Supplementary Material online). Although there was bias toward longer genes in the overlap list, this was also apparent in the full gene lists from both studies.
The overlap genes mapped to all of the major chromosomes, and their differentiated variants were mostly noncoding SNPs located in introns (supplementary table S4, Supplementary Material online). Two SNPs were synonymous (Strn-Mlck, coding for a protein kinase, and Cht7, chitin binding; supplementary table S4, Supplementary Material online), while a 3′UTR variant was predicted to putatively alter protein structure in a nondisruptive manner, most likely impacting expression and/or protein translational efficiency (jbug, response to stimulus; supplementary table S4, Supplementary Material online). One nonsynonymous SNP was predicted to alter protein function as a missense variant (fab1, signaling; supplementary table S4, Supplementary Material online). The increase in frequency of the favored desiccation alleles compared with the controls ranged from 5% to 27%, with a median increase of 16.4%.
Despite the strong overlap, there were important differences between the two studies. For example, our previous work revealed evidence of a selective sweep in the 5′ promoter region of the Dys gene in the artificially selected lines, and we confirmed a higher frequency of the selected SNPs in an independent natural association study in resistant flies from Coffs Harbour (Telonis-Scott et al. 2012). In this study, we found no differentiation at the Dys locus between our resistant and random control pools. However, both resistant and control pools had a high frequency of the selected allele detected in Telonis-Scott et al. (2012) (supplementary table S5, Supplementary Material online).
The 45 genes common to both studies were analyzed for biological function. While many genes are obviously pleiotropic (i.e., assigned multiple GO terms), almost half were involved in response to stimuli (cellular, behavioral, and regulatory), including signaling such as Rho protein signal transduction and cAMP-mediated signaling (table 3). Other categories of interest include spiracle morphogenesis; sensory organ development, including development of the compound eye; and stem cell and neuroblast differentiation (table 3).
With regard to specific stress response/desiccation candidates (table 1), 4 of the 45 genes overlapped, including the cAMP/cGMP signaling phosphodiesterases pde9 and dnc, and the MAPK pathway genes Mtl and klu implicated in NF-κB orthologue signaling in the renal tubules (table 1). Furthermore, there was overlap among mechanistic categories, where different genes in the same pathways were detected. For example, additional key cAMP/cGMP hydrolyzing phosphodiesterases pde1c and pde11 were significant in Telonis-Scott et al. (2012), as well as slpr (MAPK pathway) and poly (insulin signaling) (table 1).
Table 2. Pathway Enrichment of the GWAS and Telonis-Scott et al. First-Order PPI Networks Investigated Using FlyMine (FDR P < 0.05).
Current Study GWAS | Telonis-Scott et al. (2012)
Database | Pathway | Pathway
KEGG
Ribosome | Ribosome
Notch signaling pathway | Notch signaling pathway
MAPK signaling pathway
Dorso-ventral axis formation
Progesterone-mediated oocyte maturation
Wnt signaling pathway
Endocytosis
Proteasome
Reactome
Immune system | Immune system
Adaptive immune system | Adaptive immune system
Innate immune system | Innate immune system
Signaling by interleukins | Signaling by interleukins
(Toll-like receptor cascades) | Toll-like receptor cascades
TCR signaling
(B cell receptor signaling)
Gene expression | Gene expression
Metabolism of RNA | Metabolism of RNA
Nonsense-mediated decay | Nonsense-mediated decay
Transcription
mRNA processing
(RNA transport)
Metabolism of proteins | (Metabolism of proteins)
Translation | (Translation)
Cell cycle | Cell cycle
Cell cycle checkpoints
Regulation of mitotic cell cycle
Mitotic metaphase and anaphase
G0 and early G1 | G0 and early G1
G1 phase
(G1/S transition)
G2/M transition
(DNA replication/S phase)
DNA repair
Double-strand break repair
Signaling pathways | Signaling pathways
Insulin signaling pathway
Signaling by NGF | Signaling by NGF
Signaling by FGFR
Signaling by NOTCH
Signaling by Wnt
Signaling by EGFR
Development
Axon guidance
Membrane trafficking
Gap junction trafficking and regulation
Programmed cell death
Apoptosis
Hemostasis | Hemostasis
NOTE: KEGG and Reactome pathway databases were queried. All significant terms are reported, with Reactome terms grouped together under parent terms in italic text (full list of terms available in supplementary table S3, Supplementary Material online). Parent terms without brackets are the names of pathways that were significantly enriched; parent terms in parentheses were not listed as significantly enriched themselves but are reported as a guide to the category contents.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. Three hundred seven GWAS seeds formed a network of 3324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3215 nodes. One thousand eight hundred twenty-six, or ~55%, of the nodes overlapped between networks, which was significantly higher than the null distribution estimated from 1000 simulations (mean simulated overlap ± SD: 1304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25704 ± 4024 edges; observed overlap: 30686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online), as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
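The percentiles quoted above can be checked against the reported simulation summaries. The sketch below assumes, as the figure legend indicates, that the observed overlap is expressed as a percentile of a normal distribution fitted to the simulated overlaps; the function name is ours.

```python
from statistics import NormalDist


def overlap_percentile(observed, sim_mean, sim_sd):
    """Percentile of an observed overlap under a normal fit to simulated overlaps."""
    return 100.0 * NormalDist(mu=sim_mean, sigma=sim_sd).cdf(observed)


# Means and SDs of the 1000 simulated overlaps, as reported in the text
node_pct = overlap_percentile(1826, 1304, 143)
edge_pct = overlap_percentile(30686, 25704, 4024)
print(f"nodes: {node_pct:.2f}th percentile, edges: {edge_pct:.1f}th percentile")
```

The edge overlap lies about 1.2 SD above the simulated mean, whereas the node overlap lies more than 3.6 SD above it, which is why only the node overlap is treated as significant.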
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Although both networks showed enrichment in the immune system pathways, the ribosome, and Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways, the network constructed from the Telonis-Scott et al. (2012) gene list showed extensive involvement of cell cycle pathway proteins, as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we
Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.
Term ID | Term (Subterms) | FDR P value(a) | Genes
Cellular/behavioral response to stimulus, defense, and signaling
GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658
GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c
GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658
Development
GO:0048863 | Stem cell differentiation | 0.014 | Abd-B, mam, nkd, klu, nub
GO:0014016 | Neuroblast differentiation | 0.048 |
GO:0045165 | Cell fate commitment | 0.010 | Abd-B, dnc, fz, mam, nkd, ct, klu, klg, nub
GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-B, ct, cv-c
GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30C, Mtl
GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658
GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658
Cellular component organization
GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30C, Mtl, Trim9, nub, cv-c
NOTE: Most genes are assigned to multiple annotations and therefore are shown purely as a guide to putative function.
(a) Benjamini–Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD, such as D. melanogaster, somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short- and long-range LD and contribute numerous false positive SNP phenotype associations (Hoffmann
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53), as described in the text. The histogram shows the distribution of overlap measures for 1000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled “previous study”). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; fig. 2A). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile; fig. 2B). (E) Overlap at the level of GO biological process terms (GO terms were compared directly by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013) and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the “core” list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008); nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the “biological information flow from DNA to phenotype” (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype “buffering” (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway “cross-talk,” particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′ UTR and splice variants at the SNP level, and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster, standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies, <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the late mortality tail vs. the “early mortality” tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies, n = 1000). The DNA was fragmented using a Covaris S2 machine (Covaris, Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
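To make the trimming rules above concrete, the following is a simplified sketch of how the three criteria act on a single read. It is not trimmomatic's implementation (whose exact cut position within a failing window may differ), and the function and parameter names are ours; only the thresholds (30, window of 10 with mean quality 25, minimum length 40) come from the text.

```python
def trim_read(quals, lead_trail_min=30, window=10, window_min_mean=25, min_len=40):
    """Return (start, end) of the retained slice of a read, or None if discarded.

    quals: list of per-base Phred quality scores for one read.
    """
    start, end = 0, len(quals)
    # Remove leading and trailing bases below the quality cutoff
    while start < end and quals[start] < lead_trail_min:
        start += 1
    while end > start and quals[end - 1] < lead_trail_min:
        end -= 1
    # Scan 5' to 3' and cut at the first window whose mean quality is too low
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_min_mean:
            end = i
            break
    # Discard reads that are too short after trimming
    if end - start < min_len:
        return None
    return start, end


# Hypothetical read: poor first bases, a low-quality stretch, and a poor tail
example = [10] * 3 + [35] * 60 + [5] * 12 + [35] * 5 + [2] * 4
print(trim_read(example))
```

In this simplified version the read is cut at the start of the first failing window, so a few good bases preceding the low-quality stretch are also removed.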
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), 1% rate of missing alignments given 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the “SortSam” option in Picard v1.96.
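The alignment options listed above can be collected into a single command. The sketch below only builds the argument list for one bwa aln call and does not execute it; the file names are hypothetical placeholders, and the -f flag (write output to a file) is our addition rather than an option named in the text.

```python
def bwa_aln_command(reference, reads, output):
    """Build the argument list for one `bwa aln` call with the options listed above."""
    return [
        "bwa", "aln",
        "-l", "150",   # seed length longer than the reads, i.e., no seeding
        "-n", "0.01",  # 1% missing alignments given a 2% uniform base error rate
        "-o", "2",     # at most 2 gap opens per read
        "-e", "12",    # at most 12 gap extensions per read
        "-d", "12",    # disallow long deletions within 12 bp of the 3' end
        "-f", output,  # output .sai file (assumed; not stated in the text)
        reference, reads,
    ]


# Hypothetical file names for one trimmed read file
command = bwa_aln_command("dmel_r5.53.fasta", "pool1_P1.trimmed.fastq", "pool1_P1.sai")
print(" ".join(command))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems with file paths.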
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
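For readers unfamiliar with the concordance correlation coefficient, a minimal sketch of Lin's coefficient and its average over all replicate pairs is given below. It uses population (1/n) moments as in Lin's original definition; this is an assumption about conventions and is not the epiR implementation, so values may differ slightly from epi.ccc output. The replicate frequencies in the example are invented.

```python
from itertools import combinations
from statistics import fmean


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mx, my = fmean(x), fmean(y)
    # Population (1/n) moments, as in Lin's original definition
    var_x = fmean((xi - mx) ** 2 for xi in x)
    var_y = fmean((yi - my) ** 2 for yi in y)
    cov_xy = fmean((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


def mean_pairwise_ccc(replicates):
    """Mean CCC over all pairs of replicates (each a list of allele frequencies)."""
    return fmean(concordance_ccc(a, b) for a, b in combinations(replicates, 2))


# Invented allele frequencies at four SNPs in three technical replicates
reps = [[0.10, 0.30, 0.50, 0.70], [0.12, 0.28, 0.52, 0.69], [0.09, 0.33, 0.47, 0.72]]
print(round(mean_pairwise_ccc(reps), 3))
```

Unlike the Pearson correlation, the coefficient is reduced by any systematic shift between replicates, which is the property of interest when checking that pools give the same frequency estimates.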
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five “desiccation-resistant” and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a “sync” file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow -norna -no_is). The rmblast search engine was used.
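As an illustration of the intermediate “sync” file, the sketch below parses one line into per-pool allele counts. It assumes the layout we understand the Popoolation2 synchronized format to have (tab-separated contig, position, reference base, then one A:T:C:G:N:del count string per pool); the example line and the function name are ours.

```python
SYNC_ALLELES = ("A", "T", "C", "G", "N", "del")


def parse_sync_line(line):
    """Parse one sync line into (contig, position, reference base, per-pool counts)."""
    fields = line.rstrip("\n").split("\t")
    contig, position, ref = fields[0], int(fields[1]), fields[2]
    # Each remaining column holds colon-separated counts for one pool
    pools = [dict(zip(SYNC_ALLELES, map(int, f.split(":")))) for f in fields[3:]]
    return contig, position, ref, pools


# Invented example with two pools (e.g., control and desiccation resistant)
contig, position, ref, pools = parse_sync_line("2L\t5002\tG\t0:12:0:37:0:0\t0:25:0:31:0:0")
print(contig, position, ref, pools[0]["G"], pools[1]["T"])
```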
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
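The per-SNP test and filters described above can be sketched as follows. This is an illustrative reimplementation and not the Popoolation2 script: in particular, applying the minimum count to the minor allele summed over both pools is our simplifying assumption, and the allele counts in the example are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def weight(k):
        # Unnormalized hypergeometric probability of k in the top-left cell
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    # Sum all tables that are no more probable than the observed one
    extreme = sum(w for w in (weight(k) for k in range(lo, hi + 1)) if w <= observed)
    return extreme / total


def snp_p_value(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """control, resistant: (major allele count, minor allele count) for each pool.

    Returns the Fisher P value, or None if the site fails the filters.
    """
    cov_c, cov_r = sum(control), sum(resistant)
    if not (min_cov <= cov_c <= max_cov and min_cov <= cov_r <= max_cov):
        return None
    if control[1] + resistant[1] < min_count:
        return None
    return fisher_exact_two_sided(control[0], control[1], resistant[0], resistant[1])


# Invented counts: minor allele at 20% in the control pool, 50% in the resistant pool
print(snp_p_value((40, 10), (25, 25)))
```

Comparing integer weights rather than floating-point probabilities avoids tolerance issues when deciding which tables are as extreme as the observed one.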
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean $a/(a+b)$ and variance $ab/\left((a+b)^2(a+b+1)\right)$. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of a = 37 was chosen by chi-square test as the best match to the observed P value distribution (a = 10 to a = 40 were tested), with a corresponding b = 43.15 to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10 the size of the observed data
set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
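The quantities in the null model can be illustrated with a short sketch. It evaluates the stated mean and variance for a = 37 and b = 43.15 (note that a/(a+b) is approximately 0.46, matching 487/(487+568)), draws one beta-binomial count, and shows one way an empirical threshold could be read from a set of null P values. The function names, the number of reads, and the indexing convention for the threshold are our assumptions, not details of the published pipeline.

```python
import random


def beta_moments(a, b):
    """Mean and variance of the Beta(a, b) proportion underlying the beta-binomial."""
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


def simulate_beta_binomial(n_reads, a, b, rng):
    """Draw one beta-binomial count: p ~ Beta(a, b), then Binomial(n_reads, p)."""
    p = rng.betavariate(a, b)
    return sum(rng.random() < p for _ in range(n_reads))


def empirical_threshold(null_p_values, fraction):
    """Largest null P value within the lowest `fraction` of the null distribution."""
    ordered = sorted(null_p_values)
    index = max(0, int(fraction * len(ordered)) - 1)
    return ordered[index]


mean, variance = beta_moments(37, 43.15)
rng = random.Random(1)  # fixed seed so the sketch is reproducible
count = simulate_beta_binomial(100, 37, 43.15, rng)  # 100 reads is a hypothetical depth
print(f"mean = {mean:.4f}, variance = {variance:.5f}, simulated count = {count}")
```

With a threshold fraction of 0.0005 (0.05%), a null set of 100000 P values would place the cutoff at the 50th smallest null P value under this convention.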
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter “-ud 0” to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of “all” features from all SNPs detected in the study and the proportion of “candidate” SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
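A feature-enrichment test of this kind can be sketched as a Pearson χ² test on a 2 × 2 table of SNP counts (candidate vs. other SNPs, inside vs. outside a feature type). The exact table layout and whether a continuity correction was applied are not stated in the text, so both are assumptions here, and the counts in the example are invented.

```python
from math import erfc, sqrt


def chi_square_2x2(cand_in, cand_out, other_in, other_out):
    """Pearson chi-square test (1 df, no continuity correction) on a 2x2 table.

    cand_in / cand_out: candidate SNPs inside / outside the feature type.
    other_in / other_out: remaining SNPs inside / outside the feature type.
    """
    n = cand_in + cand_out + other_in + other_out
    row1, row2 = cand_in + cand_out, other_in + other_out
    col1, col2 = cand_in + other_in, cand_out + other_out
    stat = n * (cand_in * other_out - cand_out * other_in) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2))
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Invented counts: 60 of 600 candidate SNPs vs. 5000 of 100000 other SNPs in a feature
stat, p_value = chi_square_2x2(60, 540, 5000, 95000)
print(f"chi-square = {stat:.2f}, P = {p_value:.3g}")
```

The direction of enrichment is read from the proportions themselves (here 10% of candidates vs. 5% of other SNPs), since the statistic is the same for over- and under-representation.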
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10^6 simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from FlyBase r5.57 were applied.
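The logic of a SNP-mode permutation test can be illustrated with a toy example: random SNP sets of the same size as the candidate list are drawn from all SNPs, so long genes are hit more often by chance, and the empirical P value reflects that bias. This is a simplified sketch under stated assumptions (toy annotation, add-one P value estimate, hypothetical function name `snp_mode_pvalue`) and not the Gowinda implementation.

```python
import random

def snp_mode_pvalue(snp_to_gene, candidate_snps, category_genes, n_sim=2000, seed=1):
    """Empirical P value for one GO category, treating SNPs as independent draws."""
    rng = random.Random(seed)
    all_snps = list(snp_to_gene)
    k = len(candidate_snps)

    def genes_hit(snps):
        # Number of distinct category genes carrying at least one of the SNPs.
        return len({snp_to_gene[s] for s in snps} & category_genes)

    observed = genes_hit(candidate_snps)
    as_extreme = sum(
        genes_hit(rng.sample(all_snps, k)) >= observed for _ in range(n_sim)
    )
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (as_extreme + 1) / (n_sim + 1)

# Toy annotation (hypothetical): one long gene with 50 SNPs, ten short genes with 2 each.
snp_to_gene = {f"long_{i}": "long_gene" for i in range(50)}
snp_to_gene.update(
    {f"short{g}_{i}": f"short_gene{g}" for g in range(10) for i in range(2)}
)

candidates = ["long_0", "short3_0", "short7_1"]
observed, p = snp_mode_pvalue(snp_to_gene, candidates, category_genes={"long_gene"})
print(observed, round(p, 3))
```

In this toy setting a hit in the long gene is unremarkable because most random SNP sets also land in it, whereas hitting two short genes at once would be rare under the null.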
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v. 2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
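The first-order network construction can be sketched in plain Python as follows. This is an illustrative stand-in for the igraph-based R workflow, assuming that the subgraph is the one induced by the seeds and their direct interactors; the protein names and the function name `first_order_network` are hypothetical.

```python
def first_order_network(edges, seeds):
    """Subgraph induced by the seed proteins and their direct interactors.

    edges: iterable of (protein, protein) pairs; seeds: candidate proteins.
    Returns (nodes, edges), with edges stored as unordered pairs.
    """
    # Undirected edges; self-connections are dropped and duplicates merge.
    clean = {frozenset((a, b)) for a, b in edges if a != b}
    seeds = set(seeds)
    # Seeds with at least one interaction, plus every protein touching a seed.
    nodes = {p for e in clean if e & seeds for p in e}
    # Keep all interactions among the retained proteins (induced subgraph).
    sub_edges = {e for e in clean if e <= nodes}
    return nodes, sub_edges

# Toy PPI list (hypothetical); "Z" is a candidate with no interaction data.
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("A", "A"), ("B", "A")]
nodes, sub_edges = first_order_network(ppi, seeds={"A", "C", "Z"})
print(sorted(nodes), len(sub_edges))
```

Candidates without any listed interaction drop out of the network, which is in line with the gene lists mapping to fewer protein seeds than genes.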
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study with alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study, and took the following form:
where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
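The gene list overlap test can be reproduced approximately from the reported list sizes. The sketch below takes the 45 shared genes reported for the core set, fills the 2 x 2 table with the standard inclusion-exclusion arithmetic, and computes a one-sided (enrichment) hypergeometric tail. The cell arithmetic, the one-sided alternative, and the function names are assumptions made for illustration, so the result need not match the published P value exactly.

```python
from math import comb

def overlap_table(n_a, n_b, n_both, n_total):
    """Cells of the 2x2 table for two gene lists drawn from n_total genes."""
    return {
        "both": n_both,
        "only_a": n_a - n_both,
        "only_b": n_b - n_both,
        "neither": n_total - n_a - n_b + n_both,
    }

def hypergeom_upper_tail(n_a, n_b, n_both, n_total):
    """P(overlap >= n_both) when n_b genes are drawn at random from n_total."""
    total = comb(n_total, n_b)
    tail = sum(
        comb(n_a, x) * comb(n_total - n_a, n_b - x)
        for x in range(n_both, min(n_a, n_b) + 1)
    )
    return tail / total

N_A, N_B, N_BOTH, N_TOTAL = 416, 382, 45, 17106  # list sizes taken from the text
table = overlap_table(N_A, N_B, N_BOTH, N_TOTAL)
expected = N_A * N_B / N_TOTAL  # overlap expected by chance
p_value = hypergeom_upper_tail(N_A, N_B, N_BOTH, N_TOTAL)
print(table, f"expected overlap = {expected:.1f}", f"P = {p_value:.2e}")
```

Exact integer binomial coefficients are used throughout, so no numerical library is needed even though the intermediate values are very large.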
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Because of the lower resolution of the array-based data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization; Benjamini–Hochberg FDR correction, P < 0.05; Ontology = biological process; and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
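The simulation described above can be sketched on a toy network. The code resamples two gene sets, builds a first-order network from each, intersects the edge sets, and counts the nodes and edges that remain once zero-degree nodes are discarded. It is a pure-Python illustration under stated assumptions (toy interaction list, induced subgraphs, hypothetical function names) rather than the igraph-based code used in the study.

```python
import random

def first_order(edges, seeds):
    """Edges of the subgraph induced by the seeds and their direct interactors."""
    seeds = set(seeds)
    nodes = {p for e in edges if e & seeds for p in e}
    return {e for e in edges if e <= nodes}

def overlap_counts(edges, seeds_a, seeds_b):
    """(nodes, edges) shared by two first-order networks, zero-degree nodes dropped."""
    shared = first_order(edges, seeds_a) & first_order(edges, seeds_b)
    return len({p for e in shared for p in e}), len(shared)

def null_distribution(edges, universe, size_a, size_b, n_iter=1000, seed=0):
    """Overlap counts for randomly resampled gene sets of the given sizes."""
    rng = random.Random(seed)
    universe = sorted(universe)
    return [
        overlap_counts(edges, rng.sample(universe, size_a), rng.sample(universe, size_b))
        for _ in range(n_iter)
    ]

def percentile(observed, null_values):
    """Percentage of simulated values that do not exceed the observed value."""
    return 100.0 * sum(v <= observed for v in null_values) / len(null_values)

# Toy data (hypothetical): 60 genes, of which only the first 40 have interactions.
genes = [f"g{i}" for i in range(60)]
edges = {
    frozenset((genes[i], genes[(i + step) % 40])) for i in range(40) for step in (1, 7)
}
obs_nodes, obs_edges = overlap_counts(edges, genes[0:6], genes[3:9])
null = null_distribution(edges, genes, 6, 6, n_iter=200)
print("node percentile:", percentile(obs_nodes, [n for n, _ in null]))
print("edge percentile:", percentile(obs_edges, [e for _, e in null]))
```

Sampling from all 60 genes, including those without interactions, mirrors the resampling from the whole gene annotation rather than only from genes present in the PPI database.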
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
Contingency table for the gene list overlap test:
| | Not in GWAS | In GWAS |
|---|---|---|
| Not in Telonis-Scott et al. (2012) | N – intersect – A – B | A – intersect |
| In Telonis-Scott et al. (2012) | B – intersect | Intersect(A, B) |
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool. 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. A set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013–2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
In addition to the significant gene-level overlap between the two studies, we observed significant overlap between the PPI networks constructed from each set of candidate genes (summarized in fig. 2). We examined the overlap by comparing the first-order networks 1) at the level of node overlap and 2) at the level of edge (interaction) overlap. Three hundred seven GWAS seeds formed a network of 3,324 nodes, while our previous set of candidates mapped to 337 seeds, forming a network that comprised 3,215 nodes. One thousand eight hundred twenty-six, or ~55%, of the nodes overlapped between networks, which was significantly higher than the null distribution estimated from 1,000 simulations (mean simulated overlap ± SD: 1,304 ± 143 nodes; observed overlap at the 99.9th percentile; fig. 2A and C). The overlap at the interaction level was not significantly higher than expected (mean simulated overlap ± SD: 25,704 ± 4,024 edges; observed overlap: 30,686 edges, 89th percentile; fig. 2B and D). Hive plots were used to visualize the separate networks (supplementary fig. S4A and B, Supplementary Material online) as well as the overlap between the networks (supplementary fig. S4C, Supplementary Material online).
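The reported percentiles can be checked against the simulated means and standard deviations. The sketch below assumes that the percentiles come from a normal approximation to the null distribution, as the caption of figure 2 indicates; the function name `normal_percentile` is introduced here for illustration.

```python
from math import erf, sqrt

def normal_percentile(observed, mean, sd):
    """z-score and percentile of an observation under a normal approximation."""
    z = (observed - mean) / sd
    return z, 100.0 * 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Values reported in the text: observed overlap, simulated mean, simulated SD.
z_nodes, pct_nodes = normal_percentile(1826, 1304, 143)
z_edges, pct_edges = normal_percentile(30686, 25704, 4024)
print(f"nodes: z = {z_nodes:.2f}, percentile = {pct_nodes:.2f}")
print(f"edges: z = {z_edges:.2f}, percentile = {pct_edges:.2f}")
```

Under this approximation the node overlap lies more than three standard deviations above the simulated mean, whereas the edge overlap lies only a little more than one standard deviation above it.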
Given the high degree of shared nodes from distinct sets of protein seeds, we compared the networks for biological function (fig. 2E). Both networks produced long lists of significantly enriched GO terms (supplementary table S3, Supplementary Material online), and assessing overlap was difficult due to the nonindependent nature of GO hierarchies. However, a comparison of functional pathway enrichment revealed some differences. Although both networks showed enrichment in the immune system pathways, the ribosome, and Notch and NGF signaling pathways, as well as some involvement of gene expression and translation pathways, the network constructed from the Telonis-Scott et al. (2012) gene list showed extensive involvement of cell cycle pathway proteins as well as significant enrichment for Wnt signaling, EGFR signaling, DNA double-strand break repair, axon guidance, gap junction trafficking, and apoptosis pathways, which were not enriched in the GWAS network. The GWAS network was specifically enriched for dorso-ventral axis formation, oocyte maturation, and many more gene expression pathways than the Telonis-Scott et al. (2012) gene network (table 2 and supplementary table S3, Supplementary Material online).
Discussion
Genomic Signature of Desiccation Resistance from Natural Standing Genetic Variation
We report the first large-scale genome-wide screen for natural alleles associated with survival of low humidity conditions in D. melanogaster. By combining a powerful natural association study with high-throughput Pool-seq, we mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or to adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).

Table 3. Overrepresentation of GO Categories for the 45 Core Candidate Genes Overlapping between the Current and Our Previous (Telonis-Scott et al. 2012) Genomic Study.

| Term ID | Term (Subterms) | FDR P value (a) | Genes |
|---|---|---|---|
| Cellular/behavioral response to stimulus, defense, and signaling | | | |
| GO:0050896 | Response to stimulus (cellular response to stimulus, regulation of response to stimulus) | 0.007 | Abd-B, dnc, fz, mam, nkd, slo, ct, fru, klu, Rbf, klg, santa-maria, jbug, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| GO:0048585 | Negative regulation of response to stimulus (Rho protein signal transduction, signal transduction) | 0.042 | dnc, mam, slo, fru, Trim9, cv-c |
| GO:0023052 | Signaling (cAMP-mediated signaling, cell communication) | 0.031 | Amph, Pde9, dnc, fz, mam, nkd, ct, klu, Rbf, santa-maria, fab1, CG33275, Mtl, Trim9, cv-c, Gprk2, Ect4, CG43658 |
| Development | | | |
| GO:0048863 | Stem cell differentiation | 0.014 | Abd-D, mam, nkd, klu, nub |
| GO:0014016 | Neuroblast differentiation | 0.048 | |
| GO:0045165 | Cell fate commitment | 0.010 | Abd-D, dnc, fz, mam, nkd, ct, klu, klg, nub |
| GO:0035277 | Spiracle morphogenesis, open tracheal system | 0.010 | Abd-D, ct, cv-c |
| GO:0007423 | Sensory organ development (eye, wing disc, imaginal disc) | 0.008 | fz, mam, ct, klu, klg, sdk, Amph, Nckx30c, Mtl |
| GO:0048736 | Appendage development | 0.020 | fz, mam, ct, fru, CG33275, Mtl, Fili, nub, cv-c, Gprk2, CG43658 |
| GO:0007552 | Metamorphosis | 0.001 | fz, mam, ct, fru, CG33275, Mtl, Fili, cv-c, Gprk2, CG43658 |
| Cellular component organization | | | |
| GO:0030030 | Cell projection organization | 0.020 | dnc, fz, ct, fru, GluClalpha, Nckx30c, Mtl, Trim9, nub, cv-c |

NOTE. Most genes are assigned to multiple annotations, and the terms are therefore shown purely as a guide to putative function.
(a) Benjamini–Hochberg FDR and gene length corrected P values using the FlyMine v40.0 database Gene Ontology Enrichment. The terms were condensed and trimmed using REVIGO.
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations with greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS due to the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly due to high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD such as D. melanogaster somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short and long-range LD and contribute numerous false positive SNP phenotype associations (Hoffmann and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).

Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53) as described in the text. The histogram shows the distribution of overlap measures for 1,000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled "previous study"). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; 2A). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile; 2B). (E) Overlap at the level of GO biological process terms (directly compared GO terms by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013) and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in regulating insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008). nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′UTR and splice variants at the SNP level and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster, standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1,200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9,000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies; <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the "late mortality" tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies; n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
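The trimming rules described above can be illustrated with a short Python sketch. This is not Trimmomatic's implementation: the function name `trim_read`, the order in which the steps are applied, and the representation of a read as a list of Phred scores are assumptions made for illustration. Only the thresholds (30 for leading and trailing bases, window width 10 with mean quality 25, minimum length 40 bp) come from the text.

```python
def trim_read(quals, leading=30, trailing=30, window=10, window_q=25, min_len=40):
    """Return (start, end) of the retained slice, or None if the read is discarded.

    `quals` is a list of per-base Phred quality scores (hypothetical input format).
    """
    start, end = 0, len(quals)
    # Remove leading bases below the quality cutoff.
    while start < end and quals[start] < leading:
        start += 1
    # Remove trailing bases below the quality cutoff.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # Slide a fixed-width window from the 5' end and cut at the first
    # window whose mean quality falls below the threshold.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break
    # Discard reads that are too short after trimming.
    if end - start < min_len:
        return None
    return start, end


# A read with a low-quality tail keeps only its high-quality prefix.
print(trim_read([35] * 60 + [10] * 20))
```

A read whose retained slice is shorter than 40 bases returns `None`, which corresponds to the read being dropped before alignment.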
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), 1% rate of missing alignments given 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
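For readers who wish to reproduce the alignment settings, the options listed above can be assembled into command lines as in the following Python sketch. The helper names and all file names are hypothetical; only the option values are taken from the text, and the commands are built as argument lists rather than executed.

```python
def bwa_aln_command(reference, fastq):
    """Build a `bwa aln` argument list using the options listed in the text."""
    return [
        "bwa", "aln",
        "-l", "150",   # seed length set to 150, described in the text as no seeding
        "-n", "0.01",  # 1% missing alignments given a 2% uniform base error rate
        "-o", "2",     # at most 2 gap opens per read
        "-e", "12",    # at most 12 gap extensions per read
        "-d", "12",    # disallow long deletions within 12 bp of the 3' end
        reference, fastq,
    ]


def bwa_sampe_command(reference, sai_1, sai_2, fastq_1, fastq_2):
    """Build the paired-end `bwa sampe` argument list (writes SAM to stdout)."""
    return ["bwa", "sampe", reference, sai_1, sai_2, fastq_1, fastq_2]


# Hypothetical file names, shown only to illustrate the call order.
print(" ".join(bwa_aln_command("dmel_r5.53.fa", "pool1_P1.fastq")))
print(" ".join(bwa_sampe_command("dmel_r5.53.fa", "p1.sai", "p2.sai",
                                 "pool1_P1.fastq", "pool1_P2.fastq")))
```

Each list could be passed to `subprocess.run` with standard output redirected to a file, since both bwa subcommands write their results to standard output by default.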
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
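The replicate concordance measure can be sketched in Python as follows. The sketch implements Lin's concordance correlation coefficient from its standard definition and averages it over all replicate pairs; it is an illustration and not the epiR code, and the function names and example frequencies are hypothetical.

```python
from itertools import combinations
from statistics import fmean, pvariance


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for two equal-length vectors."""
    mean_x, mean_y = fmean(x), fmean(y)
    covariance = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    # Agreement is penalised both by scatter and by a shift in location.
    return 2 * covariance / (pvariance(x) + pvariance(y) + (mean_x - mean_y) ** 2)


def mean_pairwise_ccc(replicates):
    """Mean coefficient over all pairs of replicate allele frequency vectors."""
    pairs = list(combinations(replicates, 2))
    return fmean([concordance_cc(x, y) for x, y in pairs])


# Hypothetical allele frequency estimates at four SNPs in three replicates.
control = [
    [0.10, 0.25, 0.40, 0.55],
    [0.12, 0.24, 0.41, 0.53],
    [0.09, 0.27, 0.38, 0.56],
]
print(mean_pairwise_ccc(control))
```

Values close to 1 indicate that replicate pools give nearly identical frequency estimates, which is the condition under which merging replicates is reasonable.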
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster including shared ancestral sequences (Flybase release r5.57) plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow, -norna, -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
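As an illustration of the per-SNP test, the following Python sketch computes a two-sided Fisher's exact test on the allele counts of the two pools and applies the count and coverage filters given above. It is not the Popoolation2 script. The two-sided rule used here (summing all tables no more probable than the observed one) and the reading of the minimum count as the minor allele count summed over both pools are assumptions.

```python
from math import exp, lgamma


def _log_choose(n, k):
    """Natural log of the binomial coefficient, stable for large read counts."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided P value for the 2x2 table [[a, b], [c, d]]."""
    row_1, row_2, col_1, total = a + b, c + d, a + c, a + b + c + d

    def log_p(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return (_log_choose(row_1, x) + _log_choose(row_2, col_1 - x)
                - _log_choose(total, col_1))

    lowest, highest = max(0, col_1 - row_2), min(row_1, col_1)
    observed = log_p(a)
    p_value = sum(exp(log_p(x)) for x in range(lowest, highest + 1)
                  if log_p(x) <= observed + 1e-7)
    return min(1.0, p_value)


def passes_filters(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """`control` and `resistant` are (major, minor) read counts for one SNP."""
    minor_total = control[1] + resistant[1]
    coverages = (sum(control), sum(resistant))
    return minor_total >= min_count and all(min_cov <= c <= max_cov for c in coverages)


# Hypothetical read counts for one SNP: (major, minor) in each pool.
control, resistant = (430, 57), (410, 158)
if passes_filters(control, resistant):
    print(fisher_exact_two_sided(control[0], control[1], resistant[0], resistant[1]))
```

Working with log-gamma values avoids overflow when coverage approaches the maximum of 10,000 reads.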
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean α/(α + β) and variance αβ/((α + β)^2 (α + β + 1)). This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of α = 37 was chosen by chi-square test as the best match to the observed P value distribution (α = 10 to α = 40 were tested), with a corresponding β = 43.15 to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10× the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
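The quantities in this procedure can be checked with a brief Python sketch. It shows how β follows from α and the coverage ratio, evaluates the stated mean and variance, draws a beta-binomial count, and takes an empirical threshold from the lowest fraction of a set of null P values. The function names are hypothetical, the null P values in the usage example are placeholders, and the sketch does not reproduce the full simulation of Bastide et al. (2013).

```python
import random


def beta_moments(alpha, beta):
    """Mean and variance of the beta distribution underlying the model."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance


def beta_binomial_draw(n_reads, alpha, beta, rng):
    """Draw a proportion from Beta(alpha, beta), then a binomial count of n_reads."""
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(n_reads))


def empirical_threshold(null_p_values, fraction=0.0005):
    """P value below which the lowest `fraction` of the null values fall."""
    ranked = sorted(null_p_values)
    k = max(1, round(len(ranked) * fraction))
    return ranked[k - 1]


ALPHA = 37
BETA = ALPHA * 568 / 487  # scaled by the coverage ratio of the two pools

rng = random.Random(1)
print(round(BETA, 2), beta_moments(ALPHA, BETA))
print(beta_binomial_draw(100, ALPHA, BETA, rng))
# Placeholder null P values; a real analysis would use the simulated data set.
print(empirical_threshold([rng.random() for _ in range(20000)]))
```

The ratio 568/487 applied to α = 37 reproduces the β = 43.15 reported in the text, which supports reading the two bracketed numbers as coverage depths.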
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
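One way to set up such a test is sketched below in Python. The exact table layout used in the study is not given in the text, so the 2x2 arrangement (candidate versus non-candidate SNPs, inside versus outside a feature type), the absence of a continuity correction, the function name, and the counts are all assumptions made for illustration.

```python
from math import erfc, sqrt


def feature_enrichment(cand_in, cand_total, all_in, all_total):
    """Chi-square test (1 df) comparing candidate SNPs with the remaining SNPs.

    cand_in / cand_total: candidate SNPs in the feature / all candidate SNPs.
    all_in / all_total: all SNPs in the feature / all SNPs (candidates included).
    """
    a, b = cand_in, cand_total - cand_in
    c = all_in - cand_in
    d = (all_total - cand_total) - c
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    direction = "over" if a / (a + b) > c / (c + d) else "under"
    return chi2, p_value, direction


# Hypothetical counts: 60 of 600 candidates versus 50,000 of 1,000,000 SNPs
# fall in the feature type of interest.
print(feature_enrichment(60, 600, 50000, 1000000))
```

The returned direction distinguishes over- from under-enrichment, since the chi-square statistic alone is unsigned.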
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria, where SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis, with the following parameters: 10^6 simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
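The logic of a SNP-level permutation test can be conveyed with a small Python sketch. It is written from the description above and is not Gowinda's code: random SNP sets of the same size as the candidate set are drawn from all SNPs, mapped to genes, and the number of genes hit in a GO category is compared with the observed number. The function name, the add-one empirical P value, and the toy data are assumptions.

```python
import random


def go_permutation_p(candidate_snps, all_snps, snp_to_gene, category_genes,
                     n_sim=1000, seed=0):
    """Empirical P value for the number of category genes hit by candidate SNPs."""
    rng = random.Random(seed)

    def genes_hit(snps):
        # Each gene counts once, however many SNPs fall in it.
        return len({snp_to_gene[s] for s in snps if s in snp_to_gene} & category_genes)

    observed = genes_hit(candidate_snps)
    pool = list(all_snps)
    as_extreme = sum(
        genes_hit(rng.sample(pool, len(candidate_snps))) >= observed
        for _ in range(n_sim)
    )
    return observed, (as_extreme + 1) / (n_sim + 1)


# Toy data: 100 SNPs, two per gene; the category contains the first five genes.
all_snps = list(range(100))
snp_to_gene = {s: "g%d" % (s // 2) for s in all_snps}
category = {"g0", "g1", "g2", "g3", "g4"}
print(go_permutation_p([0, 2, 4, 6, 8], all_snps, snp_to_gene, category, n_sim=200))
```

Because long genes carry more SNPs, they are drawn more often in the random sets, which is how sampling at the SNP level absorbs the gene length bias.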
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction P < 0.05, background = all D. melanogaster genes in our full PPI network, and Ontology (for GO enrichment only) = biological process or molecular function.
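The construction of a first-order network can be illustrated in Python without igraph. The sketch below assumes that the subgraph is induced, meaning that interactions between two first-order neighbours are retained alongside interactions involving a seed; this reading, the function name, and the toy interaction list are assumptions.

```python
def first_order_network(ppi_edges, seeds):
    """Return (nodes, edges) for the seeds plus their direct interactors.

    `ppi_edges` is an iterable of (protein_a, protein_b) pairs from the full
    PPI network; edges are returned as frozensets so direction is ignored.
    """
    # Drop self-connections and duplicate or reversed pairs.
    pairs = {frozenset(e) for e in ppi_edges if e[0] != e[1]}
    in_network = {node for pair in pairs for node in pair}
    seeds = set(seeds) & in_network  # seeds absent from the PPI data are ignored
    nodes = set(seeds)
    for pair in pairs:
        if pair & seeds:
            nodes |= pair
    # Induced subgraph: keep every interaction whose two ends were retained.
    edges = {pair for pair in pairs if pair <= nodes}
    return nodes, edges


# Toy interaction list with one self-connection and one unreachable edge.
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "A"), ("D", "E")]
nodes, edges = first_order_network(ppi, ["A", "C", "Z"])
print(sorted(nodes), len(edges))
```

In this toy case the seed "Z" has no recorded interactions and is dropped, and the edge between "D" and "E" is excluded because "E" is two steps from the nearest seed.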
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study with alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study and took the following form,
where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in D. melanogaster reference genome r5.53, n = 17,106.
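The gene list overlap test can be sketched in Python using the list sizes given above together with the core overlap of 45 genes reported in this article. The sketch computes the four cell counts by membership and a one-sided P value as the upper tail of the hypergeometric distribution, which is equivalent to a one-sided Fisher's exact test. Whether the published test was one- or two-sided is not stated, so that choice is an assumption, as are the function names.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_cells(n_total, size_a, size_b, n_shared):
    """Cell counts of the 2x2 table, keyed by list membership."""
    return {
        "both": n_shared,
        "only_a": size_a - n_shared,
        "only_b": size_b - n_shared,
        "neither": n_total - size_a - size_b + n_shared,
    }


def overlap_p_value(n_total, size_a, size_b, n_shared):
    """P(overlap >= n_shared) when list B is drawn at random from all genes."""
    log_denominator = log_choose(n_total, size_b)
    return sum(
        exp(log_choose(size_a, k) + log_choose(n_total - size_a, size_b - k)
            - log_denominator)
        for k in range(n_shared, min(size_a, size_b) + 1)
    )


N, A, B, SHARED = 17106, 416, 382, 45
cells = overlap_cells(N, A, B, SHARED)
expected = A * B / N  # overlap expected by chance
print(cells, round(expected, 2), overlap_p_value(N, A, B, SHARED))
```

With these inputs, roughly nine shared genes would be expected by chance, so an overlap of 45 lies far in the upper tail.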
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Due to this limitation, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
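The resampling scheme can be outlined in Python as follows. The sketch stores the full PPI network as an adjacency dictionary, builds an induced first-order network for each resampled gene set, intersects the two edge sets, and counts the nodes and edges that remain. Using the empirical percentile of the simulated values, rather than a fitted normal distribution as in figure 2, is a simplification, and all names and the toy network are hypothetical.

```python
import random


def first_order_edges(adjacency, seeds):
    """Edges of the subgraph induced by the seeds and their direct neighbours."""
    nodes = set()
    for seed in seeds:
        if seed in adjacency:
            nodes.add(seed)
            nodes |= adjacency[seed]
    return {frozenset((a, b)) for a in nodes
            for b in adjacency.get(a, ()) if b in nodes and a != b}


def overlap_counts(edges_1, edges_2):
    """Number of nodes and edges shared by two networks."""
    shared = edges_1 & edges_2
    # Only nodes attached to a shared edge are counted, so zero-degree
    # nodes drop out of the overlap network.
    nodes = {node for edge in shared for node in edge}
    return len(nodes), len(shared)


def simulate_overlap(adjacency, gene_universe, size_1, size_2, n_iter=1000, seed=0):
    """Null distribution of (node, edge) overlap for random gene sets."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        set_1 = rng.sample(gene_universe, size_1)
        set_2 = rng.sample(gene_universe, size_2)
        results.append(overlap_counts(first_order_edges(adjacency, set_1),
                                      first_order_edges(adjacency, set_2)))
    return results


def percentile_of(observed, simulated):
    """Percentage of simulated values that do not exceed the observed value."""
    return 100 * sum(s <= observed for s in simulated) / len(simulated)


# Toy network: a chain A-B-C plus an unconnected protein D.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
null = simulate_overlap(adjacency, list(adjacency), 2, 2, n_iter=50)
print(percentile_of(2, [nodes for nodes, _ in null]))
```

In a real analysis the gene universe would be the full mapped gene annotation and the two set sizes would match the two candidate lists, so that the null distribution reflects networks of comparable size.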
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship DE120102575 and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
Contingency table for the gene list overlap test:
                                      Not in GWAS              In GWAS
Not in Telonis-Scott et al. (2012)    N – intersect – A – B    A – intersect
In Telonis-Scott et al. (2012)        B – intersect            Intersect(A, B)
Chung H Loehlin DW Dufour HD Vaccarro K Millar JG Carroll SB2014 A single gene affects both ecological divergence and matechoice in Drosophila Science 3431148ndash1151
Cingolani P Platts A Wang LL Coon M Nguyen T Wang L Land SJ LuX Ruden DM 2012 A program for annotating and predicting theeffects of single nucleotide polymorphisms SnpEff SNPs in thegenome of Drosophila melanogaster strain w1118 iso-2 iso-3 Fly(Austin) 680ndash92
Civelek M Lusis AJ 2014 Systems genetics approaches to understandcomplex traits Nat Rev Genet 1534ndash48
Clusella-Trullas S Blackburn TM Chown SL 2011 Climatic predictors oftemperature performance curve parameters in ectotherms implycomplex responses to climate change Am Nat 177738ndash751
Csardi G Nepusz T 2006 The igraph software package for complexnetwork research InterJournal Complex Systems 1695 Availablefrom httpigraphorg
Davies SA Cabrero P Overend G Aitchison L Sebastian S Terhzaz SDow JA 2014 Cell signalling mechanisms for insect stress toleranceJ Exp Biol 217119ndash128
Davies SA Cabrero P Povsic M Johnston NR Terhzaz S Dow JA 2013Signaling by Drosophila capa neuropeptides Gen Comp Endocrinol18860ndash66
Davies SA Overend G Sebastian S Cundall M Cabrero P Dow JATerhzaz S 2012 Immune and stress response lsquocross-talkrsquo in theDrosophila Malpighian tubule J Insect Physiol 58488ndash497
Day JP Dow JA Houslay MD Davies SA 2005 Cyclic nucleotide phos-phodiesterases in Drosophila melanogaster Biochem J 388333ndash342
de Nadal E Ammerer G Posas F 2011 Controlling gene expression inresponse to stress Nat Rev Genet 12833ndash845
DePristo MA Banks E Poplin R Garimella KV Maguire JR Hartl CPhilippakis AA del Angel G Rivas MA Hanna M et al 2011 Aframework for variation discovery and genotyping using next-gen-eration DNA sequencing data Nat Genet 43491ndash498
Djawdan M Chippindale AK Rose MR Bradley TJ 1998 Metabolic re-serves and evolved stress resistance in Drosophila melanogasterPhysiol Zool 71584ndash594
Durham MF Magwire MM Stone EA Leips J 2014 Genome-wide anal-ysis in Drosophila reveals age-specific effects of SNPs on fitness traitsNat Commun 54338
Emilsson V Thorleifsson G Zhang B Leonardson AS Zink F Zhu JCarlson S Helgason A Walters GB Gunnarsdottir S et al 2008Genetics of gene expression and its effect on disease Nature452423ndash428
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgrò CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgrò CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard.
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra: flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626.
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org.
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102: advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR.
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgrò CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
mapped SNPs associated with desiccation resistance in 500 naturally derived genotypes. Although Pool-seq has limited power to reveal very low frequency alleles, to reconstruct phase, or to adequately estimate allele frequencies in small pools (Lynch et al. 2014), the Pool-GWAS approach has been validated by retrieval of known candidate genes involved in body pigmentation using similar numbers of genotypes as assessed here (Bastide et al. 2013).
Although studies in controlled genetic backgrounds report large effects of single genes on desiccation resistance (table 1), our current and previous screens revealed similarly large numbers of variants (over 600) mapping to over 300 genes. A polygenic basis of desiccation resistance is consistent with quantitative genetic data for the trait (Foley and Telonis-Scott 2011), and we also anticipate more complexity from natural D. melanogaster populations, which have greater standing genetic variation than inbred laboratory strains. However, some false positive associations are likely in the GWAS because of the limitations of quantitative trait mapping approaches. Pooled evolve-and-resequence experiments routinely yield vast numbers of candidate SNPs, partly because of high levels of standing linkage disequilibrium (LD) and hitchhiking neutral alleles (Schlötterer et al. 2015). Large population sizes from species with low levels of LD, such as D. melanogaster, somewhat circumvent this issue, and a notable feature of our experimental design is our sampling. Here, we selected the most desiccation-resistant phenotypes directly from substantial standing genetic variation, where one generation of recombination should largely reflect natural haplotypes.
We observed the largest candidate allele frequency differences in loci that had intermediate allele frequencies in the control pool. This is in line with population genetics models, which predict that evolution from standing genetic variation will initially be strongest for intermediate frequency alleles, which can facilitate rapid change (Long et al. 2015). However, we also observed a number of low frequency alleles, which, if beneficial, can inflate false positives due to long-range LD (Orozco-terWengel et al. 2012; Nuzhdin and Turner 2013; Tobler et al. 2014). Nonetheless, spurious LD associations cannot fully account for our candidate list, as evidenced by the desiccation functional candidates and cross-study overlap. Furthermore, chromosome inversion dynamics can also generate both short- and long-range LD and contribute numerous false positive SNP-phenotype associations (Hoffmann and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Fig. 2. Observed versus simulated network overlap between the GWAS first-order PPI network and the first-order PPI network from the Telonis-Scott et al. (2012) gene list. Overlap is presented in terms of (A) node number and (B) edge number. Networks were simulated by resampling from the entire Drosophila melanogaster gene list (r5.53), as described in the text. The histogram shows the distribution of overlap measures for 1,000 simulations. The black line represents the histogram as density, and the blue line shows the corresponding normal distribution. The red vertical line shows the observed overlap from the real networks, labeled as a percentile of the normal distribution. Area-proportional Venn diagrams summarize the extent of overlap for PPI networks constructed separately from the GWAS candidates and Telonis-Scott et al. (2012) candidates (labeled “previous study”). (C) Overlap expressed as the number of nodes. Approximately 55% of the nodes overlapped between the two studies, which was significantly higher than simulated expectations (99.9th percentile; 2B). (D) Overlap expressed in the number of interactions. This was not significantly higher than expected by simulation (89th percentile; 2B). (E) Overlap at the level of GO biological process terms (directly compared GO terms by name; this was not tested statistically because of the complex hierarchical nature of GO terms and is presented for interest).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013) and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the “core” list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunity focused on the primary stress sensing and fluid secretion tissues. Genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated; CrebB, for example, is involved in insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008). nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors, including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the “biological information flow from DNA to phenotype” (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype “buffering” (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
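The resampling logic behind this network comparison (summarized in fig. 2) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency dictionary `interactions`, the function names, and the use of an empirical percentile (rather than a percentile of a fitted normal distribution) are assumptions made for illustration, and the exact resampling scheme is described elsewhere in the text.

```python
import random

def first_order_network(seeds, interactions):
    """Return (nodes, edges) of the first-order network around `seeds`.

    `interactions` maps each gene to the set of its interaction partners
    (a hypothetical stand-in for a PPI database). The network keeps every
    seed, every direct partner of a seed, and each seed-partner edge,
    stored as a sorted tuple so that A-B and B-A are the same edge.
    """
    nodes, edges = set(), set()
    for gene in seeds:
        nodes.add(gene)
        for partner in interactions.get(gene, ()):
            nodes.add(partner)
            edges.add(tuple(sorted((gene, partner))))
    return nodes, edges

def overlap_percentile(seeds_a, seeds_b, interactions, universe,
                       n_sim=1000, seed=1):
    """Observed node overlap and its empirical percentile among simulations.

    Each simulation draws two random seed sets of the same sizes as the
    real ones from `universe` and rebuilds both first-order networks.
    """
    rng = random.Random(seed)
    nodes_a, _ = first_order_network(seeds_a, interactions)
    nodes_b, _ = first_order_network(seeds_b, interactions)
    observed = len(nodes_a & nodes_b)
    genes = sorted(universe)
    simulated = []
    for _ in range(n_sim):
        sim_a, _ = first_order_network(rng.sample(genes, len(seeds_a)), interactions)
        sim_b, _ = first_order_network(rng.sample(genes, len(seeds_b)), interactions)
        simulated.append(len(sim_a & sim_b))
    below = sum(1 for s in simulated if s < observed)
    return observed, 100.0 * below / n_sim
```

The same pattern applies to edge overlap by intersecting the second element returned by `first_order_network` instead of the first.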
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway “cross-talk”, particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′UTR and splice variants at the SNP level and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster, standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1,200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9,000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies; <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the late mortality tail vs. the “early mortality” tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, in order to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer’s instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies; n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris, Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
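To make the trimming rules concrete, the following Python sketch applies the stated thresholds to a list of per-base Phred scores. It is a simplified illustration rather than trimmomatic's own algorithm: the function name, the order of the steps, and the choice to cut at the start of the first failing window are assumptions.

```python
def trim_read(quals, lead_trail_min=30, window=10, window_min=25, min_len=40):
    """Return (start, end) of the retained slice, or None if the read is dropped.

    `quals` is a list of per-base Phred quality scores for one read.
    """
    start, end = 0, len(quals)
    # Remove low-quality bases from the 5' end, then from the 3' end.
    while start < end and quals[start] < lead_trail_min:
        start += 1
    while end > start and quals[end - 1] < lead_trail_min:
        end -= 1
    # Slide a fixed-width window from the 5' end and cut the read at the
    # first window whose mean quality falls below the threshold.
    for i in range(start, max(start, end - window + 1)):
        if sum(quals[i:i + window]) / window < window_min:
            end = i
            break
    # Discard reads that are too short after trimming.
    if end - start < min_len:
        return None
    return start, end

print(trim_read([10] * 5 + [35] * 50 + [10] * 5))  # low-quality ends removed
```

A read that is uniformly high quality is returned untouched, whereas a read with a long low-quality stretch is truncated before that stretch and dropped if fewer than 40 bases remain.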
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150); 1% rate of missing alignments, given 2% uniform base error rate (-n 0.01); a maximum of 2 gap opens per read (-o 2); a maximum of 12 gap extensions per read (-e 12); and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the “SortSam” option in Picard v1.96.
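The alignment options listed above can be collected into a single command line. The short Python sketch below only builds the argument list and prints it; the reference prefix and FASTQ file name are hypothetical placeholders, and the wrapper function is not part of the published pipeline.

```python
def bwa_aln_command(reference_prefix, fastq_path):
    """Build the argument list for one `bwa aln` call with the stated options."""
    return [
        "bwa", "aln",
        "-l", "150",   # described in the text as "no seeding"
        "-n", "0.01",  # rate of missing alignments at 2% base error rate
        "-o", "2",     # maximum number of gap opens per read
        "-e", "12",    # maximum number of gap extensions per read
        "-d", "12",    # disallow long deletions within 12 bp of the 3' end
        reference_prefix,
        fastq_path,
    ]

# Hypothetical file names, for illustration only.
cmd = bwa_aln_command("dmel_r5.53", "sample1_P1.trimmed.fastq")
print(" ".join(cmd))
```

Because `bwa aln` writes its output to standard output, a caller would normally redirect the result of this command to a file before running `bwa sampe`.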
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately, for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
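The two replicate summaries described above (mean pairwise difference and mean concordance correlation coefficient) can be sketched in Python as follows. This is an illustrative reimplementation, not the epiR code: the function names are hypothetical, the pairwise difference is taken as an absolute difference, and the divide-by-n moments are an assumption that may differ in detail from epi.ccc.

```python
from itertools import combinations
from statistics import mean

def concordance_ccc(x, y):
    """Concordance correlation coefficient for two equal-length vectors."""
    n = len(x)
    mx, my = mean(x), mean(y)
    # Divide-by-n moments (an assumption; epi.ccc may differ in detail).
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)

def replicate_summary(replicates):
    """Mean pairwise absolute difference and mean CCC over replicate pairs.

    `replicates` is a list of allele-frequency vectors, one per technical
    replicate, all covering the same SNPs in the same order.
    """
    diffs, cccs = [], []
    for x, y in combinations(replicates, 2):
        diffs.append(mean([abs(a - b) for a, b in zip(x, y)]))
        cccs.append(concordance_ccc(x, y))
    return mean(diffs), mean(cccs)
```

With five replicates per pool category, `replicate_summary` averages over the ten possible replicate pairs; a coefficient near 1 indicates that replicates agree in both correlation and absolute frequency.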
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five “desiccation-resistant” and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a “sync” file using the mpileup2sync script in Popoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome, created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow -norna -no_is). The rmblast search engine was used.
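For readers unfamiliar with the “sync” format, the Python sketch below parses one line into per-pool allele counts and computes an allele frequency. It assumes the tab-separated layout documented for PoPoolation2 (contig, position, reference base, then one A:T:C:G:N:del count field per pool); the example line, the pool order (control first), and the function names are hypothetical.

```python
def parse_sync_line(line):
    """Parse one line of a sync file into (chrom, pos, ref, pools).

    Each pool is a dict of read counts for A, T, C and G; the N and
    deletion fields are ignored in this sketch.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    pools = []
    for entry in fields[3:]:
        a, t, c, g = (int(v) for v in entry.split(":")[:4])
        pools.append({"A": a, "T": t, "C": c, "G": g})
    return chrom, pos, ref, pools

def allele_frequency(pool, allele):
    """Frequency of `allele` among the A/T/C/G reads of one pool."""
    depth = sum(pool.values())
    return pool[allele] / depth if depth else float("nan")

# Made-up example line with two pools (control, then resistant).
chrom, pos, ref, (control, resistant) = parse_sync_line(
    "2L\t5002\tG\t0:0:12:38:0:0\t0:0:30:30:0:0"
)
print(allele_frequency(control, "C"), allele_frequency(resistant, "C"))
```

Keeping the counts rather than only the frequencies matters here, because the downstream association test operates on read counts.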
Association Analysis
We then performed Fisher’s exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
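The per-SNP test can be illustrated with a small, dependency-free Python sketch of a two-sided Fisher’s exact test on a 2 × 2 table of allele counts (rows are pools, columns are alleles), together with the count and coverage filters stated above. This is not the Popoolation2 script: the function names are hypothetical, and treating the minimum count as the minor-allele count summed over both pools is an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def passes_filters(major, minor, min_count=10, min_cov=20, max_cov=10000):
    """`major` and `minor` are (control, resistant) read-count pairs."""
    coverages = [major[0] + minor[0], major[1] + minor[1]]
    return (sum(minor) >= min_count
            and all(min_cov <= cov <= max_cov for cov in coverages))

# Control pool: 38 major / 12 minor reads; resistant pool: 30 / 30.
if passes_filters((38, 30), (12, 30)):
    print(fisher_exact_two_sided(38, 12, 30, 30))
```

Exact integer binomial coefficients keep the sketch simple; a production tool would work with log probabilities to stay fast at high coverage.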
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean $\alpha/(\alpha+\beta)$ and variance $\alpha\beta/\left((\alpha+\beta)^{2}(\alpha+\beta+1)\right)$. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of $\alpha = 37$ was chosen by chi-square test as the best match to the observed P value distribution ($\alpha = 10$ to $\alpha = 40$ were tested), with a corresponding $\beta = 43.15$ to reflect the coverage depth difference between the random control (487×) and desiccation-resistant (568×) pools. These values were then used to simulate a null data set 10× the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
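As a brief arithmetic check of the parameters above, the sketch below evaluates the mean and variance expressions and shows one way an empirical threshold could be read off a list of null P values. The function names and the quantile convention (rounding to the nearest rank) are assumptions; the simulation of the null P values themselves is not reproduced here.

```python
def beta_mean(alpha, beta):
    """Mean alpha / (alpha + beta), as given in the text."""
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    """Variance alpha*beta / ((alpha+beta)^2 (alpha+beta+1)), as given in the text."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

def empirical_threshold(null_p_values, fraction=0.0005):
    """P value marking the lowest `fraction` (0.05% by default) of null P values."""
    ordered = sorted(null_p_values)
    rank = max(1, round(len(ordered) * fraction))
    return ordered[rank - 1]

# With alpha = 37 and beta = 43.15 the mean is close to the share of total
# coverage contributed by the control pool, 487 / (487 + 568).
mean_from_parameters = beta_mean(37, 43.15)
control_share = 487 / (487 + 568)
```

Under these assumptions the two quantities agree to about four decimal places, which is consistent with the statement that the value of β was chosen to reflect the coverage difference between the pools.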
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
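A feature-type enrichment test of this kind can be sketched as a 2 × 2 chi-square test. The layout below (candidate SNPs versus the remaining SNPs, inside versus outside one feature type), the absence of a continuity correction, and the function name are assumptions for illustration; the exact tabulation used in the study is not stated in this section.

```python
from math import erfc, sqrt

def feature_enrichment(cand_in, cand_total, all_in, all_total):
    """Chi-square test (1 df, no continuity correction) for one feature type.

    cand_in / cand_total: candidate SNPs in the feature / all candidate SNPs.
    all_in / all_total:   all SNPs in the feature / all SNPs (candidates included).
    Returns (statistic, p_value, direction).
    """
    a, b = cand_in, cand_total - cand_in
    c = all_in - cand_in
    d = (all_total - cand_total) - c
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    statistic = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    direction = "over" if a * d > b * c else "under"
    return statistic, p_value, direction
```

For example, `feature_enrichment(10, 30, 40, 100)` compares 10 of 30 candidate SNPs against 30 of the remaining 70 SNPs in the same feature type and reports the feature as under-represented among candidates, though not significantly so.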
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria, where SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10^6 simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
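The idea behind a SNP-level permutation test can be illustrated with a toy Python sketch. This is not Gowinda's implementation; the data structures, the function names, and the add-one smoothing of the empirical P value are assumptions. The point of the sketch is that random SNP sets are drawn from all SNPs, so genes carrying more SNPs are hit more often in the null as well.

```python
import random

def genes_hit(snps, snp_to_genes):
    """Set of genes containing at least one of the given SNPs."""
    hit = set()
    for snp in snps:
        hit |= snp_to_genes.get(snp, set())
    return hit

def permutation_p(candidates, snp_to_genes, category_genes, n_sim=1000, seed=0):
    """Empirical P value for one GO category, treating SNPs as independent.

    snp_to_genes maps every SNP in the study (hypothetical IDs) to the genes
    it falls in; category_genes is the gene set of one GO category.
    """
    rng = random.Random(seed)
    all_snps = sorted(snp_to_genes)
    observed = len(genes_hit(candidates, snp_to_genes) & category_genes)
    as_extreme = 0
    for _ in range(n_sim):
        sample = rng.sample(all_snps, len(candidates))
        if len(genes_hit(sample, snp_to_genes) & category_genes) >= observed:
            as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return observed, (as_extreme + 1) / (n_sim + 1)
```

With a toy study of 100 SNPs in 100 distinct genes, three candidate SNPs that all fall in a three-gene category give a small empirical P value, because random draws of three SNPs rarely hit all three genes.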
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction P < 0.05, background = all D. melanogaster genes in our full PPI network, and Ontology (for GO enrichment only) = biological process or molecular function.
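The first-order network construction can be sketched without any graph library. The study used the igraph R package; the plain-Python version below is only an illustration, and the edge-list representation and function name are assumptions. It takes the induced subgraph on the seed proteins plus their direct interactors and drops self-connections.

```python
def first_order_network(edges, seeds):
    """Edges among seed proteins and their direct interactors, without self-loops.

    edges: iterable of (protein, protein) pairs from the full PPI network.
    seeds: candidate proteins used to seed the network.
    """
    seeds = set(seeds)
    # Undirected edges as frozensets; pairs with identical ends are dropped.
    clean = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    members = set(seeds)
    for edge in clean:
        if edge & seeds:
            members |= edge  # add the first-order interactor(s) of a seed
    # Keep every edge whose two ends are both in the seed-plus-neighbour set.
    return {edge for edge in clean if edge <= members}
```

For the toy edge list `[("A", "B"), ("B", "C"), ("C", "D"), ("A", "A")]` with seed `B`, the result contains the edges A-B and B-C only.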
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study with alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study, and took the following form:
where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in D. melanogaster reference genome r5.53, n = 17,106.
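A gene list overlap test of this kind can be sketched with the hypergeometric distribution. The sketch below uses the list sizes given in the text and the 45-gene overlap reported for the core set; the one-sided formulation, the function names, and the cell arithmetic (chosen so that the four cells sum to N) are assumptions rather than the authors' exact procedure.

```python
from math import comb

def overlap_table(n_a, n_b, n_both, n_total):
    """2x2 table: rows = not in / in list A, columns = not in / in list B."""
    return [[n_total - n_a - n_b + n_both, n_b - n_both],
            [n_a - n_both, n_both]]

def overlap_p_value(n_a, n_b, n_both, n_total):
    """One-sided Fisher (hypergeometric) P value for an overlap >= n_both."""
    denom = comb(n_total, n_b)
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, upper + 1))
    return tail / denom

# List sizes from the text: A = 416 genes, B = 382 genes, N = 17,106 genes.
table = overlap_table(416, 382, 45, 17106)
p_value = overlap_p_value(416, 382, 45, 17106)
expected_overlap = 416 * 382 / 17106
```

Under random sampling the expected overlap is roughly nine genes, so an overlap of 45 genes yields a very small P value in this sketch.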
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Due to this limitation, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
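The resampling procedure can be sketched in plain Python. The study used igraph in R; the version below is an illustrative stand-in in which the edge-list representation, the function names, and the add-one empirical P value are assumptions. Counting nodes only from shared edges means that zero-degree nodes never enter the overlap network.

```python
import random

def first_order_edges(edges, seeds):
    """Edges among seeds and their direct interactors, without self-loops."""
    seeds = set(seeds)
    clean = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    members = set(seeds)
    for edge in clean:
        if edge & seeds:
            members |= edge
    return {edge for edge in clean if edge <= members}

def overlap_counts(edges, set_a, set_b):
    """(node count, edge count) of the intersection of two first-order networks."""
    shared = first_order_edges(edges, set_a) & first_order_edges(edges, set_b)
    nodes = set().union(*shared)
    return len(nodes), len(shared)

def simulate_null(edges, genes, size_a, size_b, n_iter=1000, seed=0):
    """Overlap counts for random gene sets of the same sizes as the two lists."""
    rng = random.Random(seed)
    pool = sorted(genes)
    return [overlap_counts(edges, rng.sample(pool, size_a), rng.sample(pool, size_b))
            for _ in range(n_iter)]

def empirical_p(observed, null_values):
    """Share of null values at least as large as the observed value."""
    return (1 + sum(value >= observed for value in null_values)) / (1 + len(null_values))
```

The node and edge counts would be tested separately, for example by passing the first element of each simulated pair to `empirical_p` together with the observed node count.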
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR Hoffmann AA McKechnie SW Umina PA Weeks AR2005 The latitudinal cline in the In(3R)Payne inversion polymor-phism has shifted in the last 20 years in Australian Drosophilamelanogaster populations Mol Ecol 14851ndash858
Andolfatto P Wall JD Kreitman M 1999 Unusual haplotype structureat the proximal breakpoint of In(2L)t in a natural population ofDrosophila melanogaster Genetics 1531297ndash1311
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M Ball CA Blake JA Botstein D Butler H Cherry JM DavisAP Dolinski K Dwight SS Eppig JT et al 2000 Gene ontology toolfor the unification of biology The Gene Ontology Consortium NatGenet 2525ndash29
Aytes A Mitrofanova A Lefebvre C Alvarez Mariano J Castillo-MartinM Zheng T Eastham James A Gopalan A Pienta Kenneth J ShenMM et al 2014 Cross-species regulatory network analysis identifiesa synergistic interaction between FOXM1 and CENPF that drivesprostate cancer malignancy Cancer Cell 25638ndash651
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet 10:e1004775.
Bonebrake TC Mastrandrea MD 2010 Tolerance adaptation and pre-cipitation changes complicate latitudinal patterns of climate changeimpacts Proc Natl Acad Sci U S A 10712581ndash12586
Bradley TJ 2009 Animal osmoregulation Oxford Oxford UniversityPress
Burke MK Liti G Long AD 2014 Standing genetic variation drivesrepeatable experimental evolution in outcrossing populations ofSaccharomyces cerevisiae Mol Biol Evol 313228ndash3239
Chown SL 2012 Trait-based approaches to conservation physiologyforecasting environmental change risks from the bottom upPhilos Trans R Soc B 3671615ndash1627
Chown SL Nicolson SW 2004 Insect physiological ecology mechanismsand patterns Oxford Oxford University Press
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol 57:1070–1084.
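Contingency table for the gene list overlap test:
(rows: Telonis-Scott et al. 2012) | Not in GWAS | In GWAS
Not in Telonis-Scott et al. (2012) | N – intersect – A – B | A – intersect
In Telonis-Scott et al. (2012) | B – intersect | Intersect(A, B)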
Chung H Loehlin DW Dufour HD Vaccarro K Millar JG Carroll SB2014 A single gene affects both ecological divergence and matechoice in Drosophila Science 3431148ndash1151
Cingolani P Platts A Wang LL Coon M Nguyen T Wang L Land SJ LuX Ruden DM 2012 A program for annotating and predicting theeffects of single nucleotide polymorphisms SnpEff SNPs in thegenome of Drosophila melanogaster strain w1118 iso-2 iso-3 Fly(Austin) 680ndash92
Civelek M Lusis AJ 2014 Systems genetics approaches to understandcomplex traits Nat Rev Genet 1534ndash48
Clusella-Trullas S Blackburn TM Chown SL 2011 Climatic predictors oftemperature performance curve parameters in ectotherms implycomplex responses to climate change Am Nat 177738ndash751
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA Cabrero P Overend G Aitchison L Sebastian S Terhzaz SDow JA 2014 Cell signalling mechanisms for insect stress toleranceJ Exp Biol 217119ndash128
Davies SA Cabrero P Povsic M Johnston NR Terhzaz S Dow JA 2013Signaling by Drosophila capa neuropeptides Gen Comp Endocrinol18860ndash66
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol 58:488–497.
Day JP Dow JA Houslay MD Davies SA 2005 Cyclic nucleotide phos-phodiesterases in Drosophila melanogaster Biochem J 388333ndash342
de Nadal E Ammerer G Posas F 2011 Controlling gene expression inresponse to stress Nat Rev Genet 12833ndash845
DePristo MA Banks E Poplin R Garimella KV Maguire JR Hartl CPhilippakis AA del Angel G Rivas MA Hanna M et al 2011 Aframework for variation discovery and genotyping using next-gen-eration DNA sequencing data Nat Genet 43491ndash498
Djawdan M Chippindale AK Rose MR Bradley TJ 1998 Metabolic re-serves and evolved stress resistance in Drosophila melanogasterPhysiol Zool 71584ndash594
Durham MF Magwire MM Stone EA Leips J 2014 Genome-wide anal-ysis in Drosophila reveals age-specific effects of SNPs on fitness traitsNat Commun 54338
Emilsson V Thorleifsson G Zhang B Leonardson AS Zink F Zhu JCarlson S Helgason A Walters GB Gunnarsdottir S et al 2008Genetics of gene expression and its effect on disease Nature452423ndash428
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol 21:4748–4769.
Foley BR Telonis-Scott M 2011 Quantitative genetic analysis suggestscausal association between cuticular hydrocarbon composition anddesiccation survival in Drosophila melanogaster Heredity 10668ndash77
Folk DG Han C Bradley TJ 2001 Water acquisition and partitioning inDrosophila melanogaster effects of selection for desiccation-resis-tance J Exp Biol 2043323ndash3331
Fowler MA Montell C 2013 Drosophila TRP channels and animal be-havior Life Sci 92394ndash403
Fu J Keurentjes JJB Bouwmeester H America T Verstappen FWA WardJL Beale MH de Vos RCH Dijkstra M Scheltema RA et al 2009System-wide molecular evidence for phenotypic buffering inArabidopsis Nat Genet 41166ndash167
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG Fukuzato F Matzkin LM 2003 Evolution of water conserva-tion mechanisms in Drosophila J Exp Biol 2061183ndash1192
Gibbs AG Matzkin LM 2001 Evolution of water balance in the genusDrosophila J Exp Biol 2042331ndash2338
Goodstadt L 2010 Ruffus a lightweight Python library for computa-tional pipelines Bioinformatics 262778ndash2779
Hadley NF 1994 Water relations of terrestrial arthropods San Diego(CA) Academic Press
Hoffmann AA Hallas R Sinclair C Mitrovski P 2001 Levels of variationin stress resistance in Drosophila among strains local populationsand geographic regions patterns for desiccation starvation coldresistance and associated traits Evolution 551621ndash1630
Hoffmann AA Parsons PA 1989a An integrated approach to environ-mental stress tolerance and life-history variation desiccation toler-ance in Drosophila Biol J Linn Soc 37117ndash136
Hoffmann AA Parsons PA 1989b Selection for increased desiccationresistance in Drosophila melanogaster additive genetic control andcorrelated responses for other stresses Genetics 122837ndash845
Hoffmann AA Sgro CM 2011 Climate change and evolutionary adap-tation Nature 470479ndash485
Hoffmann AA Weeks AR 2007 Climatic selection on genes and traitsafter a 100 year-old invasion a critical look at the temperate-tropicalclines in Drosophila melanogaster from eastern Australia Genetica129133ndash147
Huang W Richards S Carbone MA Zhu D Anholt RR Ayroles JFDuncan L Jordan KW Lawrence F Magwire MM et al 2012Epistasis dominates the genetic architecture of Drosophila quanti-tative traits Proc Natl Acad Sci U S A 10915553ndash15559
Huang Z Tunnacliffe A 2004 Response of human cells to desiccationcomparison with hyperosmotic stress response J Physiol558181ndash191
Huang Z Tunnacliffe A 2005 Gene induction by desiccation stress inhuman cell cultures FEBS Lett 5794973ndash4977
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A 111:E2616–E2621.
Itoh M Nanba N Hasegawa M Inomata N Kondo R Oshima MTakano-Shimizu T 2010 Seasonal changes in the long-distance link-age disequilibrium in Drosophila melanogaster J Hered 10126ndash32
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L Kapan N Dircksen H Winther AM Nassel DR 2010 Metabolicstress responses in Drosophila are modulated by brain neurosecre-tory cells that produce multiple neuropeptides PLoS One 5e11480
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol 23:1813–1827.
Kawano T Shimoda M Matsumoto H Ryuda M Tsuzuki S Hayakawa Y2010 Identification of a gene Desiccate contributing to desicca-tion resistance in Drosophila melanogaster J Biol Chem28538889ndash38897
Kellermann V van Heerwaarden B Sgro CM Hoffmann AA 2009Fundamental evolutionary limits in ecological traits driveDrosophila species distributions Science 3251244ndash1246
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM Eldridge JV Ramachandran S Raphael BJ 2013Network analysis of GWAS data Curr Opin Genet Dev 23602ndash610
Li H Durbin R 2009 Fast and accurate short read alignment withBurrows-Wheeler transform Bioinformatics 251754ndash1760
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol 523:1840–1863.
Lohse M Bolger AM Nagel A Fernie AR Lunn JE Stitt M Usadel B 2012RobiNA a user-friendly integrated software solution for RNA-Seq-based transcriptomics Nucleic Acids Res 40W622ndashW627
Long A Liti G Luptak A Tenaillon O 2015 Elucidating the moleculararchitecture of adaptation via evolve and resequence experimentsNat Rev Genet 16567ndash582
Lynch M Bost D Wilson S Maruki T Harrison S 2014 Population-genetic inference from pooled-sequencing data Genome Biol Evol61210ndash1218
Mackay TFC Stone EA Ayroles JF 2009 The genetics of quantitativetraits challenges and prospects Nat Rev Genet 10565ndash577
Marron MT Markow TA Kain KJ Gibbs AG 2003 Effects of starvationand desiccation on energy metabolism in desert and mesicDrosophila J Insect Physiol 49261ndash270
Matzkin LM Markow TA 2009 Transcriptional regulation of metabo-lism associated with the increased desiccation resistance of thecactophilic Drosophila mojavensis Genetics 1821279ndash1288
Murali T Pacifico S Yu J Guest S Roberts GG Finley RL Jr 2011 DroID2011 a comprehensive integrated resource for protein transcrip-tion factor RNA and gene interactions for Drosophila Nucleic AcidsRes 39D736ndashD743
Nuzhdin SV Turner TL 2013 Promises and limitations of hitchhikingmapping Curr Opin Genet Dev 23694ndash699
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y Tittiger C Wicker-Thomas C Le Goff G Young S Wajnberg EFricaux T Taquet N Blomquist GJ Feyereisen R 2012 An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon bio-synthesis Proc Natl Acad Sci U S A 10914858ndash14863
Rajpurohit S Oliveira CC Etges WJ Gibbs AG 2013 Functional genomicand phenotypic responses to desiccation in natural populations of adesert Drosophilid Mol Ecol 222698ndash2715
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M Hangartner S Hoffmann AA 2013 Assessing the relativeimportance of environmental effects carry-over effects and speciesdifferences in thermal stress resistance a comparison ofDrosophilids across field and laboratory generations J Exp Biol2163790ndash3798
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ Gibbs AG Roberts SP 2007 Gene transcription during ex-posure to and recovery from cold and desiccation stress inDrosophila melanogaster Insect Mol Biol 16435ndash443
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra: flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102: advanced approaches to interrogating FlyBase. Nucleic Acids Res 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD Tibshirani R 2003 Statistical significance for genomewidestudies Proc Natl Acad Sci U S A 1009440ndash9445
Telonis-Scott M Gane M DeGaris S Sgro CM Hoffmann AA 2012 Highresolution mapping of candidate alleles for desiccation resistancein Drosophila melanogaster under selection Mol Biol Evol291335ndash1351
Telonis-Scott M Guthridge KM Hoffmann AA 2006 A new set oflaboratory-selected Drosophila melanogaster lines for the analysisof desiccation resistance response to selection physiology and cor-related responses J Exp Biol 2091837ndash1847
Terhzaz S Cabrero P Robben JH Radford JC Hudson BD Milligan GDow JA Davies SA 2012 Mechanism and function of Drosophilacapa GPCR a desiccation stress-responsive receptor with functionalhomology to human neuromedinU receptor PLoS One 7e29897
Terhzaz S Overend G Sebastian S Dow JA Davies SA 2014 The Dmelanogaster capa-1 neuropeptide activates renal NF-kB signalingPeptides 53218ndash224
Terhzaz S Teets NM Cabrero P Henderson L Ritchie MG Nachman RJDow JA Denlinger DL Davies SA 2015 Insect capa neuropeptidesimpact desiccation and cold tolerance Proc Natl Acad Sci U S A1122882ndash2887
Thorat LJ Gaikwad SM Nath BB 2012 Trehalose as an indicator ofdesiccation stress in Drosophila melanogaster larvae a potentialmarker of anhydrobiosis Biochem Biophys Res Commun419638ndash642
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol 31:364–375.
Turner TL Bourne EC Von Wettberg EJ Hu TT Nuzhdin SV 2010Population resequencing reveals local adaptation of Arabidopsislyrata to serpentine soils Nat Genet 42260ndash263
Turner TL Miller PM 2012 Investigating natural variation in Drosophilacourtship song by the evolve and resequence approach Genetics191633ndash642
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol 26:1890–1902.
Wang B Goode J Best J Meltzer J Schilman PE Chen J Garza D ThomasJB Montminy M 2008 The insulin-regulated CREB coactivatorTORC promotes stress resistance in Drosophila Cell Metab7434ndash444
Wang J Kean L Yang J Allan AK Davies SA Herzyk P Dow JA 2004Function-informed transcriptome analysis of Drosophila renaltubule Genome Biol 5R69
Wang K Li M Hakonarson H 2010 Analysing biological pathways ingenome-wide association studies Nat Rev Genet 11843ndash854
Wang L Jia P Wolfinger RD Chen X Zhao Z 2011 Gene set analysis ofgenome-wide association studies methodological issues and per-spectives Genomics 981ndash8
Weber AL Khan GF Magwire MM Tabor CL Mackay TF Anholt RR2012 Genome-wide association analysis of oxidative stress resistancein Drosophila melanogaster PLoS One 7e34745
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett 5:756–763.
Zhu Y Bergland AO Gonzalez J Petrov DA 2012 Empirical validation ofpooled whole genome population re-sequencing in Drosophila mel-anogaster PLoS One 7e41901
and Weeks 2007; Tobler et al. 2014), but we found little association between the desiccation pool and inversion frequency changes, consistent with the lack of latitudinal patterns in desiccation resistance and trait/inversion frequency associations in east Australian D. melanogaster (Hoffmann et al. 2001; Weeks et al. 2002).
Cross-Study Comparison Reveals a Core Set of Candidates for a Complex Trait
Our data contain many candidate genes, functions, and pathways for insect desiccation resistance, but partitioning genuine adaptive loci from experimental noise remains a significant challenge. We therefore focus our efforts on our novel comparative analysis, where we identified 45 robust candidates for further exploration. Common genetic signatures are rarely documented in cross-study comparisons of complex traits (Sarup et al. 2011; Huang et al. 2012; but cf. Vermeulen et al. 2013), and never before for desiccation resistance. Some degree of overlap between our studies likely results from a shared genetic background from similar sampling sites, highlighting the replicability of these responses. One limitation of this approach is the overpopulation of the "core" list by longer genes. This result reflects the gene length bias inherent to GWAS approaches, an issue that is not straightforward to correct in a GWAS, let alone in the comparative framework applied here. A focus on the common genes functionally linked to the desiccation phenotype can provide a meaningful framework for future research, and also suggests that some large-effect candidate genes exhibit natural variation contributing to the phenotype (table 1).
Other feasible candidates include genes expressed in the MTs (Wang et al. 2004) or essential for MT development (St Pierre et al. 2014), providing further research opportunities focused on the primary stress sensing and fluid secretion tissues. These genes include one of the most highly expressed MT transcripts, CG7084, as well as dnc, Nckx30C, slo, cv-c, and nkd. Other aspects of stress resistance are also implicated: CrebB, for example, is involved in insulin-regulated stress resistance and thermosensory behavior (Wang et al. 2008), and nkd also impacts larval cuticle development and shares a PDZ signaling domain also present in the desiccation candidate desi (Kawano et al. 2010; St Pierre et al. 2014). Finally, several core candidates were consistently reported in recent studies, suggesting generalized roles in stress responses and fitness, including transcriptome studies on MT function (Wang et al. 2004), inbreeding and cold resistance (Vermeulen et al. 2013), oxidative stress (Weber et al. 2012), and genomic studies exploring age-specific SNPs on fitness traits (Durham et al. 2014) and genes likely under spatial selection in flies from climatically divergent habitats (Fabian et al. 2012).
Several of the enriched GO categories from the core list also make biological sense, including stimulus response (almost half the suite), defense, and signaling, particularly in pathways involved in MT stress sensing (Davies et al. 2012, 2013, 2014). Functional categories enriched for sensory organ development were highlighted, consistent with environmental sensing categories from desert Drosophila transcriptomics (Matzkin and Markow 2009; Rajpurohit et al. 2013) and stress detection via phototransduction pathways in D. melanogaster under a range of stressors including desiccation (Sørensen et al. 2007).
Evidence for Conserved Signatures at Higher Level Organization
Beyond the gene level, we observed a striking degree of similarity between this study and that of Telonis-Scott et al. (2012) at the network level, where significantly overlapping protein networks were constructed from largely different seed protein sets. This supports a scenario where the "biological information flow from DNA to phenotype" (Civelek and Lusis 2014) contains inherent redundancy, where alternative genetic solutions underlie phenotypes and functions. Resistance to perturbation by genetic or environmental variation appears to increase with increasing hierarchical complexity, that is, phenotype "buffering" (Fu et al. 2009). The same functional network in different populations/genetic backgrounds can be impacted, but the exact genes and SNPs responding to perturbation will not necessarily overlap. In Drosophila, different sets of loci from the same founders mapped to the same PPI network in the case of two complex traits, chill coma recovery and startle response (Huang et al. 2012). Furthermore, interspecific shared networks/pathways from different genes have been documented for numerous complex traits (Emilsson et al. 2008; Aytes et al. 2014; Ichihashi et al. 2014), providing opportunities for investigating taxa where individual gene function is poorly understood.
Although the null distribution is not always obvious when comparing networks and their functions, partly because of reduced sensitivity from missing interactions (Leiserson et al. 2013), here the shared network signatures portray a plausible multifaceted stress response involving stress sensing, rapid gene accessibility, and expression. The predominance of immune system and stress response pathways is of interest with reference to immunity/stress pathway "cross-talk," particularly in the MTs, where abiotic stressors elicit strong immunity transcriptional responses (Davies et al. 2012). The higher order architectures reflect crucial aspects of rapid gene expression control in response to stress, from chromatin organization (Alexander and Lomvardas 2014) to transcription, RNA decay, and translation (reviewed in de Nadal et al. 2011). In Drosophila, variation in transcriptional regulation of individual genes can underpin divergent desiccation phenotypes (Chung et al. 2014) or be functionally inferred from patterns of many genes elicited during desiccation stress (Rajpurohit et al. 2013). Variants with potential to modulate expression at different stages of gene expression control were evident in our GWAS data alone, for example, enrichment for 5′ UTR and splice variants at the SNP level and for chromatin and chromosome structure terms at the functional level.
Genetic Overlap Versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that in natural D. melanogaster populations standing genetic variation can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene, where our GWAS approach revealed no differentiation between the control and resistant pools at this locus but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) in both pools at high frequency. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1,200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9,000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] < 10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies, < 6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the "late mortality" tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, in order to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined per pool to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies, n = 1,000). The DNA was fragmented using a Covaris S2 machine (Covaris Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
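The trimming rules described above can be illustrated with a minimal sketch. This is not the trimmomatic implementation: the function name `trim_read`, the order in which the steps are applied, and the choice to cut the read at the start of the first failing window are assumptions made here for illustration only.

```python
def trim_read(seq, quals, end_q=30, window=10, window_q=25, min_len=40):
    """Trim one read; return (seq, quals), or None if the read is discarded.

    quals is a list of integer Phred scores, one per base.
    Hypothetical helper: a simplified stand-in for the trimming steps
    described in the text, not the trimmomatic code itself.
    """
    # Drop leading and trailing bases below the end-quality cutoff.
    start, end = 0, len(quals)
    while start < end and quals[start] < end_q:
        start += 1
    while end > start and quals[end - 1] < end_q:
        end -= 1
    seq, quals = seq[start:end], quals[start:end]

    # Scan 5' to 3' and cut the read at the first window whose mean
    # quality falls below the window threshold (assumed cut position).
    for i in range(len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            seq, quals = seq[:i], quals[:i]
            break

    # Discard reads that end up shorter than the minimum length.
    if len(seq) < min_len:
        return None
    return seq, quals
```

A read that passes all filters is returned unchanged, whereas a read shorter than 40 bp after trimming yields `None`.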
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), 1% rate of missing alignments given 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
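For readers who wish to reproduce the alignment settings, the options listed above could be assembled as follows. This is a sketch only: the helper name `bwa_aln_command` and both file names are hypothetical, and the command is built but not executed.

```python
import shlex


def bwa_aln_command(reference, fastq):
    """Build the argument list for one bwa aln call (output goes to stdout)."""
    return [
        "bwa", "aln",
        "-l", "150",   # seed length; seeding is disabled when this exceeds the read length
        "-n", "0.01",  # fraction of missing alignments given a 2% uniform base error rate
        "-o", "2",     # maximum number of gap opens
        "-e", "12",    # maximum number of gap extensions
        "-d", "12",    # disallow long deletions within 12 bp of the 3' end
        reference, fastq,
    ]


# Hypothetical file names, for illustration only.
cmd = bwa_aln_command("dmel_r5.53.fasta", "pool1_P1.trimmed.fastq")
print(shlex.join(cmd))
```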
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency < 0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
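The pairwise concordance calculation can be sketched as below. The sketch uses Lin's concordance correlation coefficient with population (1/n) variances; it is an illustrative stand-in rather than the epiR code, and the function names are assumptions.

```python
from itertools import combinations
from statistics import mean


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for two equal-length vectors."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def mean_pairwise_ccc(replicates):
    """Average the coefficient over all pairs of replicates in one pool category.

    replicates is a list of allele frequency vectors, one per technical
    replicate, with loci in the same order in every vector.
    """
    return mean(concordance_ccc(a, b) for a, b in combinations(replicates, 2))
```

Unlike a Pearson correlation, the coefficient is penalized by a systematic shift between replicates, so two replicates that differ by a constant offset score below 1.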
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in PoPoolation2 v1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow, -norna, -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in PoPoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
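The per-SNP test and filters can be sketched in a few lines. This is not the PoPoolation2 script: the function names are hypothetical, and the sketch assumes that the minimum minor allele count is evaluated on counts summed across both pools and that the test is two sided.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(a + b + c + d, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def snp_p_value(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """Return the P value for one SNP, or None if the SNP fails the filters.

    control and resistant are (major_count, minor_count) read counts.
    """
    for cov in (sum(control), sum(resistant)):
        if cov < min_cov or cov > max_cov:
            return None
    if control[1] + resistant[1] < min_count:  # assumed: summed over pools
        return None
    return fisher_exact_two_sided(control[0], control[1],
                                  resistant[0], resistant[1])
```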
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean α/(α+β) and variance αβ/((α+β)²(α+β+1)). This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of α = 37 was chosen by chi-square test as the best match to the observed P value distribution (α = 10 to α = 40 were tested), with a corresponding β = 43.15 to reflect the coverage depth difference between the random control (487×) and desiccation-resistant (568×) pools. These values were then used to simulate a null data set 10× the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
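The ingredients of this thresholding step can be sketched as follows. The sketch is illustrative and makes several assumptions: the function names are hypothetical, the beta-binomial draw is implemented as a beta-distributed proportion followed by binomial sampling, and the threshold fraction is read here as 0.05% of the null P values. It does not reproduce the full simulation of Bastide et al. (2013).

```python
import random

ALPHA, BETA = 37.0, 43.15  # parameter values reported in the text


def beta_moments(alpha, beta):
    """Mean and variance of the beta distribution underlying the model."""
    total = alpha + beta
    return alpha / total, alpha * beta / (total ** 2 * (total + 1))


def beta_binomial_draw(n, alpha, beta, rng):
    """Draw one beta-binomial count: p ~ Beta(alpha, beta), then n Bernoulli trials."""
    p = rng.betavariate(alpha, beta)
    return sum(1 for _ in range(n) if rng.random() < p)


def lower_tail_threshold(null_p_values, fraction=0.0005):
    """P value cutoff marking the lowest `fraction` of the null P values."""
    ordered = sorted(null_p_values)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1]


mean_share, var_share = beta_moments(ALPHA, BETA)
print(round(mean_share, 4), round(487 / (487 + 568), 4))
```

The printed values show that the model mean α/(α+β) matches the share of total coverage contributed by the control pool, which is how β follows from α and the two coverage depths.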
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by comparing the proportion of each feature type among "all" SNPs detected in the study with its proportion among the "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
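The feature enrichment test can be illustrated with a standard 2 × 2 chi-square statistic. The sketch below assumes that candidate SNPs are compared with the remaining (non-candidate) SNPs, with no continuity correction; the function name is hypothetical and the counts shown are illustrative, not results from this study.

```python
def chi_square_2x2(cand_in, cand_out, other_in, other_out):
    """Pearson chi-square statistic (1 df) for a 2x2 table of SNP counts.

    cand_in / cand_out: candidate SNPs inside / outside a feature type.
    other_in / other_out: remaining SNPs inside / outside that feature type.
    """
    n = cand_in + cand_out + other_in + other_out
    cross = cand_in * other_out - cand_out * other_in
    margins = ((cand_in + cand_out) * (other_in + other_out)
               * (cand_in + other_in) * (cand_out + other_out))
    return n * cross ** 2 / margins


# Illustrative counts only. Values above 3.841 are significant at
# P < 0.05 for one degree of freedom.
stat = chi_square_2x2(20, 10, 10, 20)
print(round(stat, 3))
```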
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10^6 simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
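The logic of a SNP-mode permutation test can be sketched as below. This is a simplified illustration of the general strategy, not the Gowinda implementation: the function name, the data structures, and the toy data are assumptions. Because random SNP sets are drawn from all SNPs, genes carrying more SNPs (typically longer genes) are hit more often in the null, which is how gene length bias is absorbed.

```python
import random


def go_permutation_p(candidate_snps, all_snps, snp_to_gene, gene_to_go,
                     category, n_sim=1000, seed=0):
    """Empirical P value for one GO category under random SNP sampling."""
    rng = random.Random(seed)

    def genes_in_category(snps):
        # Each gene is counted once, however many SNPs fall in it.
        genes = {snp_to_gene[s] for s in snps if s in snp_to_gene}
        return sum(1 for g in genes if category in gene_to_go.get(g, ()))

    observed = genes_in_category(candidate_snps)
    hits = 0
    for _ in range(n_sim):
        sample = rng.sample(all_snps, len(candidate_snps))
        if genes_in_category(sample) >= observed:
            hits += 1
    return observed, hits / n_sim


# Toy data: 100 SNPs spread evenly over 10 genes, 2 of them in "stress".
toy_snps = list(range(100))
toy_snp_to_gene = {i: f"g{i // 10}" for i in toy_snps}
toy_gene_to_go = {"g0": {"stress"}, "g1": {"stress"}}
observed, p = go_permutation_p([0, 10], toy_snps, toy_snp_to_gene,
                               toy_gene_to_go, "stress")
```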
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
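The first-order network construction can be sketched without igraph as follows. The sketch assumes an induced subgraph (all interactions among the seeds and their direct neighbors are kept) and drops seeds that have no interactions; both choices, and the function name, are assumptions made for illustration.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) of the seed proteins plus their direct interactors.

    edges is an iterable of (protein, protein) pairs from the full PPI network.
    """
    seeds = set(seeds)
    # Treat interactions as undirected and remove self-connections.
    clean = {frozenset((u, v)) for u, v in edges if u != v}
    present = set().union(*clean) if clean else set()

    nodes = seeds & present          # seeds with no interactions are dropped
    for edge in clean:
        if edge & seeds:             # edge touches at least one seed
            nodes |= edge

    # Keep every interaction whose two endpoints are both in the node set.
    sub_edges = {edge for edge in clean if edge <= nodes}
    return nodes, sub_edges
```

For example, with interactions A-B, B-C, C-D and D-E and the single seed A, the network contains only A, B and the A-B interaction.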
Cross-Study Comparison Overlap with Our PreviousWork
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study and the alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study and took the form of a 2 × 2 table classifying genes by their presence in each list, where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; and N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
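The gene list overlap test can be worked through numerically. The sketch below uses the list sizes given in the text and the core set of 45 shared genes reported in the abstract. The cell counts are derived by inclusion-exclusion, and a one-sided (enrichment) test is assumed; neither detail is stated in the text, and the function names are hypothetical.

```python
from math import comb


def overlap_table(n_total, n_a, n_b, n_both):
    """Cell counts of the 2x2 table: (neither, B only, A only, both)."""
    a_only = n_a - n_both
    b_only = n_b - n_both
    neither = n_total - n_a - n_b + n_both  # inclusion-exclusion
    return neither, b_only, a_only, n_both


def overlap_p_value(n_total, n_a, n_b, n_both):
    """One-sided Fisher's exact (hypergeometric) P value for overlap >= n_both."""
    denom = comb(n_total, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, min(n_a, n_b) + 1))
    return tail / denom


N, A, B, SHARED = 17106, 416, 382, 45
cells = overlap_table(N, A, B, SHARED)
expected_overlap = A * B / N  # overlap expected by chance
p_overlap = overlap_p_value(N, A, B, SHARED)
```

Under these assumptions roughly nine shared genes would be expected by chance, so an overlap of 45 genes lies far in the upper tail.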
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Because of the lower resolution of the array data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction, P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
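The resampling procedure can be sketched as below. This is an illustrative reimplementation without igraph: the function names are hypothetical, first-order networks are assumed to be induced subgraphs, and the empirical P value is taken as the fraction of iterations with overlap at least as large as observed. Counting only nodes that appear in shared edges has the same effect as removing zero-degree nodes.

```python
import random


def first_order_edges(edges, seeds):
    """Edges among the seed proteins and their direct interactors."""
    seeds = set(seeds)
    clean = {frozenset(e) for e in edges if e[0] != e[1]}
    nodes = set(seeds)
    for edge in clean:
        if edge & seeds:
            nodes |= edge
    return {edge for edge in clean if edge <= nodes}


def overlap_counts(edges, set_a, set_b):
    """Number of nodes and edges shared by two first-order networks."""
    shared = first_order_edges(edges, set_a) & first_order_edges(edges, set_b)
    nodes = set().union(*shared) if shared else set()
    return len(nodes), len(shared)


def simulate_overlap_p(edges, genes, size_a, size_b, observed_nodes,
                       observed_edges, n_iter=1000, seed=0):
    """Empirical P values for node and edge overlap under random gene sets."""
    rng = random.Random(seed)
    ge_nodes = ge_edges = 0
    for _ in range(n_iter):
        n_nodes, n_edges = overlap_counts(edges,
                                          rng.sample(genes, size_a),
                                          rng.sample(genes, size_b))
        ge_nodes += n_nodes >= observed_nodes
        ge_edges += n_edges >= observed_edges
    return ge_nodes / n_iter, ge_edges / n_iter
```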
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and the Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez MJ, Castillo-Martin M, Zheng T, Eastham JA, Gopalan A, Pienta KJ, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
Contingency table for the gene list overlap test:
                                      Not in GWAS              In GWAS
Not in Telonis-Scott et al. (2012)    N − intersect − A − B    A − intersect
In Telonis-Scott et al. (2012)        B − intersect            Intersect(A, B)
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool. 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics. 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard.
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One. 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity. 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra: flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626.
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org.
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One. 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102: advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR.
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One. 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides. 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics. 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics. 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One. 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One. 7:e41901.
Genetic Overlap versus Genetic Differences: Causes and Implications
Despite the common signatures at the gene and higher order levels between our studies, pervasive differences were observed. These can arise from differences in standing genetic variation in the natural populations, where different allelic compositions impact gene-by-environment interactions and epistasis. Our network overlap analyses highlight the potential for different molecular trajectories to result in similar functions (Leiserson et al. 2013), consistent with the multiple physiological pathways potentially available to D. melanogaster under water stress (Chown et al. 2011). Experimental design also presumably contributed to the study differences. Strong directional selection, as applied in the 2012 study, can increase the frequency of rare beneficial alleles and likely reduced overall diversity more than did the single-generation GWAS design. Furthermore, selection regimes often impose additional selective pressures (e.g., food deprivation during desiccation), resulting in correlated responses such as starvation resistance with desiccation resistance, and these factors differ from the laboratory to the field. Finally, evidence suggests that standing genetic variation in natural D. melanogaster populations can vary extensively with sampling season (Itoh et al. 2010; Bergland et al. 2014). Although we cannot compare levels of standing variation between our two studies, evidence suggests that this may have affected the Dys gene: our GWAS approach revealed no differentiation between the control and resistant pools at this locus, but rather detected the alleles associated with the selective sweep in Telonis-Scott et al. (2012) at high frequency in both pools. Whether this resulted from sampling the later collection after a long drought (2004–2008; www.bom.gov.au/climate, last accessed January 13, 2016) is unclear, but this observation highlights the relevance of temporal sampling to standing genetic variation when attempting to link complex phenotypes to genotypes.
Materials and Methods
Natural Population Sampling
Drosophila melanogaster was collected from vineyard waste at Kilchorn Winery, Romsey, Australia, in April 2010. Over 1200 isofemale lines were established in the laboratory from wild-caught females. Species identification was conducted on male F1 to remove Drosophila simulans contamination, and over 900 D. melanogaster isofemale lines were retained. All flies were maintained in vials of dextrose cornmeal media at 25 °C and constant light.
Desiccation Assay
From each isofemale line, ten inseminated F1 females were sorted by aspiration without CO2 and held in vials until 4–5 days old. For the desiccation assay, 10 females per line (over 9000 flies in total) were screened by placing groups of females into empty vials covered with gauze and randomly assigning lines into 5 large sealed glass tanks containing trays of silica desiccant (relative humidity [RH] <10%). The tanks were scored for mortality hourly, and the final 558 surviving individuals were selected for the desiccation-resistant pool (558 flies, <6% of the total). The control pool constituted the same number of females sampled from over 200 randomly chosen isofemale lines that were frozen prior to the desiccation assay. We chose this random control approach instead of comparing the phenotypic extremes (i.e., the "late mortality" tail vs. the "early mortality" tail) because unfavorable environmental conditions experienced by the wild-caught dam may inflate environmental variance due to carryover effects in the F1 offspring (Schiffer et al. 2013).
DNA Extraction and Sequencing
We used pooled whole-genome sequencing (Pool-seq) (Futschik and Schlötterer 2010) to estimate and compare allele frequency differences between the most desiccation-resistant and randomly sampled flies from a wild population, in order to associate naturally segregating candidate SNPs with the response to low humidity. For each treatment (resistant and control), genomic DNA was extracted from 50 female heads using the Qiagen DNeasy Blood and Tissue Kit according to the manufacturer's instructions (Qiagen, Hilden, Germany). Two extractions were combined to create five pools per treatment, which were barcoded separately to create technical replicates (total 10 pools of 100 flies; n = 1000). The DNA was fragmented using a Covaris S2 machine (Covaris, Inc., Woburn, MA), and libraries were prepared from 1 µg genomic DNA (per pool of 100 flies) using the Illumina TruSeq Prep module (Illumina, San Diego, CA). To minimize the effect of variation across lanes, the ten pools were combined in equal concentration (fig. 3) and run multiplexed in each lane of a full flow cell of an Illumina HiSeq 2000 sequencer. Clusters were generated using the TruSeq PE Cluster Kit v5 on a cBot and sequenced using Illumina TruSeq SBS v3 chemistry (Illumina). Fragmentation, library construction, and sequencing were performed at the Micromon Next-Gen Sequencing facility (Monash University, Clayton, Australia).
Fig. 3. Experimental design. See text for further explanation.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files initially processed separately. Trimming was performed with trimmomatic v 0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v 0.10.1 (Andrews 2014).
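The trimming rules described above can be illustrated for a single read with a minimal sketch. This is not the trimmomatic implementation: the function name `trim_read`, the order in which the steps are applied, and the choice to cut the read at the start of the first failing window are assumptions made for illustration only.

```python
def trim_read(quals, edge_q=30, window=10, window_q=25, min_len=40):
    """Return (start, end) of the retained slice of a read, or None if discarded.

    quals: per-base Phred quality scores, ordered 5' to 3'.
    The step order and window handling are simplifying assumptions.
    """
    start, end = 0, len(quals)
    # Trim leading bases whose quality is below the edge threshold.
    while start < end and quals[start] < edge_q:
        start += 1
    # Trim trailing bases whose quality is below the edge threshold.
    while end > start and quals[end - 1] < edge_q:
        end -= 1
    # Slide a window from the 5' end; cut at the first window whose
    # mean quality falls below the window threshold.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break
    # Discard reads that are too short after trimming.
    return (start, end) if end - start >= min_len else None
```

A read of uniformly high quality is returned untouched, whereas a read whose 3' half is of low quality is shortened and, if fewer than 40 bases remain, discarded.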
After trimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v 0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150), 1% rate of missing alignments given 2% uniform base error rate (-n 0.01), a maximum of 2 gap opens per read (-o 2), a maximum of 12 gap extensions per read (-e 12), and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v 1.96.
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v 2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency <0.04 across replicates), 1803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
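As an illustration of the replicate comparison, the sketch below computes Lin's concordance correlation coefficient between two replicates and averages it over all replicate pairs. It uses population (1/n) moments and hypothetical function names; the epiR implementation used in the study may differ in its variance estimator and also reports confidence intervals.

```python
from itertools import combinations


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalises both poor correlation and systematic shifts between replicates.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def mean_pairwise_ccc(replicates):
    """Average the coefficient over all pairs of replicate frequency vectors."""
    pairs = list(combinations(replicates, 2))
    return sum(concordance_ccc(a, b) for a, b in pairs) / len(pairs)
```

Each element of `replicates` would hold the allele frequency estimates of one technical replicate across the same set of SNPs.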
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v 0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in PoPoolation2 v 1.201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow -norna -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in PoPoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, there is currently no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014), correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012), use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010), and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
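To make the per-SNP test concrete, here is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2 × 2 table of allele counts, preceded by the count and coverage filters quoted above. The function names are hypothetical and this is not the PoPoolation2 script; in particular, applying the minimum count to the minor allele summed over both pools is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def snp_p_value(ctrl_major, ctrl_minor, res_major, res_minor,
                min_count=10, min_cov=20, max_cov=10000):
    """Return the P value for one SNP, or None if it fails the filters."""
    cov_ctrl = ctrl_major + ctrl_minor
    cov_res = res_major + res_minor
    if not (min_cov <= cov_ctrl <= max_cov and min_cov <= cov_res <= max_cov):
        return None
    if ctrl_minor + res_minor < min_count:  # assumed: summed over both pools
        return None
    return fisher_exact_two_sided(ctrl_major, ctrl_minor, res_major, res_minor)
```

A SNP with low coverage in either pool, or with too few minor-allele reads, is skipped rather than tested.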
Based on our design, we chose to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean $\alpha/(\alpha+\beta)$ and variance $\alpha\beta/((\alpha+\beta)^2(\alpha+\beta+1))$. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of $\alpha = 37$ was chosen by chi-square test as the best match to the observed P value distribution ($\alpha = 10$ to $\alpha = 40$ were tested), with a corresponding $\beta = 43.15$ to reflect the coverage depth difference between the random control (487) and desiccation-resistant (568) pools. These values were then used to simulate a null data set 10 times the size of the observed data
set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
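One way to picture the null simulation is sketched below, under explicit assumptions: for each allele, the reads are split between the control and resistant pools by a beta-binomial draw with the $\alpha$ and $\beta$ values given above, and the threshold is the P value at the lower 0.05% quantile of the resulting null distribution. The function names are hypothetical, and the exact sampling scheme of Bastide et al. (2013) may differ.

```python
import random


def beta_binomial_draw(n, alpha, beta, rng):
    """Draw a beta-binomial count: p ~ Beta(alpha, beta), then Binomial(n, p)."""
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(n))


def simulate_null_table(major_total, minor_total, alpha=37.0, beta=43.15, rng=None):
    """Split each allele's reads between pools with no true differentiation.

    Returns (ctrl_major, ctrl_minor, res_major, res_minor).
    """
    rng = rng or random.Random()
    ctrl_major = beta_binomial_draw(major_total, alpha, beta, rng)
    ctrl_minor = beta_binomial_draw(minor_total, alpha, beta, rng)
    return ctrl_major, ctrl_minor, major_total - ctrl_major, minor_total - ctrl_minor


def lower_tail_threshold(null_pvalues, fraction=0.0005):
    """P value below which the given fraction of null P values fall."""
    ranked = sorted(null_pvalues)
    k = max(1, round(len(ranked) * fraction))
    return ranked[k - 1]
```

Each simulated table would be passed through the same association test as the observed data to obtain a null P value, and observed SNPs with P values below `lower_tail_threshold` of those null values would be retained as candidates.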
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10⁶ simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v 2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9633 proteins and 93799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05, background = all D. melanogaster genes in our full PPI network, and Ontology (for GO enrichment only) = biological process or molecular function.
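The first-order network construction can be sketched without igraph as follows. The function name is hypothetical, and treating the network as the induced subgraph on the seeds plus their direct interactors (so that interactions among two non-seed neighbours are retained) is an assumption about how the subgraph was built.

```python
def first_order_network(ppi_edges, seeds):
    """Edges among seed proteins and their direct interactors.

    ppi_edges: iterable of (protein_a, protein_b) pairs from the full PPI network.
    seeds: candidate proteins.
    """
    seeds = set(seeds)
    # Treat interactions as undirected and drop self-connections.
    edges = {frozenset(e) for e in ppi_edges if e[0] != e[1]}
    # Collect the seeds plus every protein that interacts directly with a seed.
    nodes = set(seeds)
    for edge in edges:
        if edge & seeds:
            nodes |= edge
    # Keep every interaction whose two endpoints are both in that node set.
    return {edge for edge in edges if edge <= nodes}
```

Seeds with no recorded interactions simply do not appear in the returned edge set.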
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study and the alleles mapped in our previous study of flies collected from the same geographic
region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study, and took the following form:
where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in the D. melanogaster reference genome r5.53, n = 17106.
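As a worked illustration of the gene list overlap test, the sketch below builds the 2 × 2 table from the list sizes given above and computes a one-sided (enrichment) Fisher's exact P value from the hypergeometric tail. The overlap of 45 genes is taken from the abstract; the one-sided alternative, the cell layout, and the function names are assumptions, as the text does not state which alternative was tested.

```python
from math import comb


def overlap_table(n_a, n_b, n_both, n_total):
    """2 x 2 table: rows = absent/present in list A, columns = absent/present in list B."""
    return [[n_total - n_a - n_b + n_both, n_b - n_both],
            [n_a - n_both, n_both]]


def overlap_enrichment_p(n_a, n_b, n_both, n_total):
    """P(overlap >= n_both) when list B is drawn at random from n_total genes."""
    denom = comb(n_total, n_b)
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, upper + 1))
    return tail / denom


# Values from the text: A = 416, B = 382, N = 17106; overlap = 45 (abstract).
table = overlap_table(416, 382, 45, 17106)
expected_overlap = 416 * 382 / 17106   # overlap expected by chance alone
p_value = overlap_enrichment_p(416, 382, 45, 17106)
```

Under these assumptions, roughly nine shared genes would be expected by chance, so an overlap of 45 genes lies far in the upper tail of the null distribution.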
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolve alleles than array-based genotyping. Because of this limitation of the array data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction, P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1000 simulation iterations.
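The resampling test for network overlap can be sketched in plain Python as below. All function names are hypothetical; the induced-subgraph definition of a first-order network and the add-one form of the empirical P value are assumptions rather than details given in the text.

```python
import random


def first_order_edges(ppi_edges, seeds):
    """Undirected edges among seeds and their direct interactors (no self-loops)."""
    seeds = set(seeds)
    edges = {frozenset(e) for e in ppi_edges if e[0] != e[1]}
    nodes = set(seeds)
    for edge in edges:
        if edge & seeds:
            nodes |= edge
    return {edge for edge in edges if edge <= nodes}


def network_overlap(edges_a, edges_b):
    """Return (node count, edge count) of the intersection of two networks.

    Nodes are counted only from shared edges, so zero-degree nodes are excluded.
    """
    shared = {frozenset(e) for e in edges_a} & {frozenset(e) for e in edges_b}
    nodes = set().union(*shared)
    return len(nodes), len(shared)


def simulate_null_overlap(ppi_edges, gene_pool, size_a, size_b, n_iter=1000, seed=0):
    """Overlap statistics for randomly resampled gene sets of the observed sizes."""
    rng = random.Random(seed)
    pool = sorted(gene_pool)
    results = []
    for _ in range(n_iter):
        net_a = first_order_edges(ppi_edges, rng.sample(pool, size_a))
        net_b = first_order_edges(ppi_edges, rng.sample(pool, size_b))
        results.append(network_overlap(net_a, net_b))
    return results


def empirical_p(observed, null_values):
    """Fraction of null values at least as large as the observed statistic."""
    return (1 + sum(v >= observed for v in null_values)) / (1 + len(null_values))
```

The node and edge counts from `simulate_null_overlap` would each be compared with the observed counts via `empirical_p`.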
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and the Monash University Technology voucher scheme to MTS, an ARC Future Fellowship and Discovery Grants to CMS, an ARC Laureate Fellowship to AAH, and a Science and Industry Endowment Fund grant to CMS and AAH.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience. 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics. 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell. 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
Contingency table for the gene list overlap test (Cross-Study Comparison):
                                      Not in GWAS              In GWAS
Not in Telonis-Scott et al. (2012)    N − intersect − A − B    A − intersect
In Telonis-Scott et al. (2012)        B − intersect            Intersect(A, B)
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science. 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin). 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems. 1695. Available from: http://igraph.org.
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA Banks E Poplin R Garimella KV Maguire JR Hartl CPhilippakis AA del Angel G Rivas MA Hanna M et al 2011 Aframework for variation discovery and genotyping using next-gen-eration DNA sequencing data Nat Genet 43491ndash498
Djawdan M Chippindale AK Rose MR Bradley TJ 1998 Metabolic re-serves and evolved stress resistance in Drosophila melanogasterPhysiol Zool 71584ndash594
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
Data Processing and Mapping
Processing was performed on a high-performance computing cluster using the Rubra pipeline system, which makes use of the Ruffus Python library (Goodstadt 2010). The final variant-calling pipeline was based on a pipeline developed by Clare Sloggett (Sloggett et al. 2014) and is publicly available at https://github.com/griffinp/GWAS_pipeline (last accessed January 13, 2016). Raw sequence reads were processed per sample per lane, with P1 and P2 read files processed separately initially. Trimming was performed with trimmomatic v0.30 (Lohse et al. 2012). Adapter sequences were also removed. Leading and trailing bases with a quality score below 30 were trimmed, and a sliding window (width = 10, quality threshold = 25) was used. Reads shorter than 40 bp after trimming were discarded. Quality before and after trimming was examined using FastQC v0.10.1 (Andrews 2014).
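The trimming rules above can be illustrated with a minimal Python sketch. It is not Trimmomatic itself: the function name `trim_read`, the order of the three steps, and the simplified window rule (cut at the start of the first window whose mean quality falls below the threshold) are assumptions made for illustration only.

```python
def trim_read(quals, leading=30, trailing=30, window=10, window_q=25, min_len=40):
    """Return the (start, end) slice to keep, or None if the read is discarded.

    quals is a list of per-base Phred quality scores (hypothetical input format).
    """
    start, end = 0, len(quals)
    # Remove leading bases below the quality cutoff.
    while start < end and quals[start] < leading:
        start += 1
    # Remove trailing bases below the quality cutoff.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # Scan 5' to 3' and cut at the first window whose mean quality is too low.
    for i in range(start, max(start, end - window + 1)):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break
    # Discard reads that are too short after trimming.
    return (start, end) if end - start >= min_len else None


clean_read = [10] * 3 + [35] * 50 + [5] * 4      # poor ends, good middle
dipping_read = [35] * 50 + [20] * 10 + [35] * 10  # quality dip near the 3' end
short_read = [35] * 30                            # too short to keep
```

Calling `trim_read(clean_read)` keeps only the high-quality middle of the read, whereas `trim_read(short_read)` returns `None` because fewer than 40 bases remain.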
Posttrimming, reads that remained paired were used as input into the alignment step. Alignment to the D. melanogaster reference genome version r5.53 was performed with bwa v0.6.2 (Li and Durbin 2009) using the bwa aln command and the following options: no seeding (-l 150); 1% rate of missing alignments given a 2% uniform base error rate (-n 0.01); a maximum of 2 gap opens per read (-o 2); a maximum of 12 gap extensions per read (-e 12); and long deletions within 12 bp of the 3′ end disallowed (-d 12). The default values were used for all other options. Output files were then converted to sorted bam files using the bwa sampe command and the "SortSam" option in Picard v1.96.
At this point, the bam files were merged into one file per sample using Picard Merge. Duplicates were identified with Picard MarkDuplicates. The tool RealignerTargetCreator in GATK v2.6-5 (DePristo et al. 2011) was applied to the ten bam files simultaneously, producing a list of intervals containing probable indels around which local realignment would be performed. This was used as input for the IndelRealigner tool, which was also applied to all ten bam files simultaneously.
To investigate the variation among technical replicates, allele frequency was estimated based on read counts for each technical replicate separately for 100,000 randomly chosen SNPs across the genome. After excluding candidate desiccation-resistance SNPs and SNPs that may have been false positives due to sequencing error (those with mean frequency < 0.04 across replicates), 1,803 SNPs remained. For each locus, we calculated the variance in allele frequency estimate among the five control replicates and the variance among the five desiccation-resistance replicates. We compared this with the variance due to pool category (control vs. desiccation resistant) using analysis of variance (ANOVA). We also calculated the mean pairwise difference in allele frequency estimate among control and among desiccation-resistance replicates, and the mean concordance correlation coefficient was calculated over all pairwise comparisons within pool category (using the epi.ccc function in the epiR package; Stevenson et al. 2015).
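As an illustration of the replicate comparison, the sketch below computes a concordance correlation coefficient and a mean pairwise difference in plain Python. It follows Lin's definition with 1/n moments, which may differ slightly from the estimator used by epi.ccc; the function names and the toy allele frequencies are hypothetical and are not data from this study.

```python
from itertools import combinations
from statistics import fmean


def ccc(x, y):
    """Lin's concordance correlation coefficient using 1/n moments."""
    mx, my = fmean(x), fmean(y)
    var_x = fmean([(a - mx) ** 2 for a in x])
    var_y = fmean([(b - my) ** 2 for b in y])
    cov_xy = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


def mean_abs_diff(x, y):
    """Mean absolute difference in allele frequency between two replicates."""
    return fmean([abs(a - b) for a, b in zip(x, y)])


def mean_over_pairs(replicates, stat):
    """Average a pairwise statistic over all replicate pairs in one pool category."""
    return fmean([stat(a, b) for a, b in combinations(replicates, 2)])


# Toy allele frequency estimates: three replicates, four SNPs each.
control_reps = [
    [0.10, 0.50, 0.90, 0.30],
    [0.12, 0.48, 0.88, 0.33],
    [0.09, 0.53, 0.91, 0.28],
]
mean_ccc = mean_over_pairs(control_reps, ccc)
mean_diff = mean_over_pairs(control_reps, mean_abs_diff)
```

A coefficient near 1 together with a small mean difference indicates that replicates within a pool category give closely matching allele frequency estimates.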
The following approach was then taken to identify SNPs differing between the desiccation-resistant and control groups. Based on the technical replicate analysis, we merged the five "desiccation-resistant" and the five random control samples into two bam files. A pileup file was created with samtools v0.1.19 (Li et al. 2009) and converted to a "sync" file using the mpileup2sync script in Popoolation2 v1201 (Kofler et al. 2011), retaining reads with a mapping quality of at least 20 and bases with a minimum quality score of 20. Reads with unmapped mates were also retained (option -A in samtools mpileup). For this step, we used a repeat-masked version of the reference genome created with RepeatMasker 4.0.3 (Smit et al. 2013-2015) and a repeat file containing all annotated transposons from D. melanogaster, including shared ancestral sequences (Flybase release r5.57), plus all repetitive elements from RepBase release v19.04 for D. melanogaster. Simple repeats, small RNA genes, and bacterial insertions were not masked (options -nolow -norna -no_is). The rmblast search engine was used.
Association Analysis
We then performed Fisher's exact test with the Fisher test script in Popoolation2 (Kofler et al. 2011). We set the minimum count (of the minor allele) = 10, minimum coverage (in both populations) = 20, and maximum coverage = 10,000. The test was performed for each SNP (sliding window off). Although there are numerous approaches to calculating genome-wide significance thresholds in a Pool-seq framework, currently there is no consensus on an appropriate standardized statistical method. Methods include simulation approaches (Orozco-terWengel et al. 2012; Bastide et al. 2013; Burke et al. 2014); correction using a null distribution based on estimating genetic drift from replicated artificial selection line allele frequency variance (Turner and Miller 2012); use of a simple threshold based on a minimum allele frequency difference (Turner et al. 2010); and the FDR correction suggested by Storey and Tibshirani (2003) (used by Fabian et al. 2012).
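The per-SNP test can be sketched in a few lines of standard-library Python. This is not the Popoolation2 script: the function names are hypothetical, and the way the filters are applied (coverage checked in each pool, minor allele count summed over both pools) is an assumption about how the thresholds above are interpreted.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def snp_p_value(control, resistant, min_count=10, min_cov=20, max_cov=10000):
    """control, resistant: (major, minor) read counts for one SNP in each pool.

    Returns None when the SNP fails the count or coverage filters.
    """
    cov_c, cov_r = sum(control), sum(resistant)
    if not (min_cov <= cov_c <= max_cov and min_cov <= cov_r <= max_cov):
        return None
    if control[1] + resistant[1] < min_count:
        return None
    return fisher_exact_two_sided(control[0], control[1], resistant[0], resistant[1])
```

For example, `snp_p_value((40, 10), (30, 30))` tests a SNP whose minor allele rises from 20% in the control pool to 50% in the resistant pool, whereas a SNP with fewer than 20 reads in either pool is skipped.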
We determined, based on our design, to calculate the FDR threshold by comparison with a simulated distribution of null P values, following the approach of Bastide et al. (2013). For each SNP locus, a P value was calculated by simulating the distribution of read counts between the major and minor allele according to a beta-binomial distribution with mean $\alpha/(\alpha+\beta)$ and variance $\alpha\beta/((\alpha+\beta)^2(\alpha+\beta+1))$. This model allowed for two types of sampling variation: stochastic variation in allele sampling across the phenotypes, and the background variation in the representation of alleles in each pool caused by library preparation artifacts or stochastic variation driven by unequal coverage of the pools. These sources of variation have also been recognized elsewhere as being important in interpreting pooled sequence allele calls and identifying significant differentiation between pooled samples (Lynch et al. 2014). The value of $\alpha = 37$ was chosen by chi-square test as the best match to the observed P value distribution ($\alpha = 10$ to $\alpha = 40$ were tested), with a corresponding $\beta = 43.15$ to reflect the coverage depth difference between the random control (487×) and desiccation-resistant (568×) pools. These values were then used to simulate a null data set 10× the size of the observed data set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
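The parameter values quoted above can be checked, and the null simulation illustrated, with a short Python sketch. It assumes that β is obtained by scaling α by the coverage ratio of the two pools and that the beta-distributed quantity is the share of reads drawn from the control pool; both are interpretations for illustration, and all function names are hypothetical rather than taken from the published pipeline.

```python
import random

ALPHA = 37.0
COV_CONTROL, COV_RESISTANT = 487, 568
# Assumption: beta scales alpha by the coverage ratio of the two pools.
BETA = ALPHA * COV_RESISTANT / COV_CONTROL


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution, as given in the text."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def simulate_split(n_reads, rng, a=ALPHA, b=BETA):
    """Split n_reads of one allele between the two pools (one beta-binomial draw)."""
    share = rng.betavariate(a, b)  # expected share of reads from the control pool
    n_control = sum(rng.random() < share for _ in range(n_reads))
    return n_control, n_reads - n_control


def lower_tail_threshold(null_p_values, fraction=0.0005):
    """P value cutoff at the lowest `fraction` (0.05%) of the null P values."""
    ranked = sorted(null_p_values)
    k = max(1, round(len(ranked) * fraction))
    return ranked[k - 1]


rng = random.Random(1)
mean_share, var_share = beta_mean_var(ALPHA, BETA)
example_split = simulate_split(100, rng)
```

Under these assumptions the mean share equals 487/(487 + 568), so the simulated read counts reflect the unequal depth of the two pools even when allele frequencies do not differ.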
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ2 tests on SNP counts, we tested for over- or under-enrichment of feature types by calculating the proportion of "all" features from all SNPs detected in the study and the proportion of "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis, with the following parameters: $10^6$ simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from Flybase r5.57 were applied.
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
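The construction of a first-order network can be sketched without igraph, using plain Python sets. The sketch assumes that the subgraph is induced on the candidates plus their direct interactors (so interactions between two interactors are kept); whether the original analysis kept those edges is not stated, and the function name and toy interaction list are hypothetical.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) of the subgraph on seed proteins and their direct interactors.

    edges is an iterable of (protein, protein) pairs; seeds is the candidate list.
    """
    seeds = set(seeds)
    # Store each interaction as an unordered pair and drop self-connections.
    clean = {frozenset(e) for e in edges if e[0] != e[1]}
    nodes = set()
    for e in clean:
        if e & seeds:  # interaction touches at least one candidate
            nodes |= e
    # Induced subgraph: keep every interaction whose two ends are both retained.
    sub_edges = {e for e in clean if e <= nodes}
    return nodes, sub_edges


# Toy interaction list with one self-connection ("a", "a").
toy_ppi = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "a")]
nodes, sub_edges = first_order_network(toy_ppi, seeds=["a"])
```

In this toy example the candidate "a" pulls in its interactors "b" and "c" together with the interaction between them, while "d" and "e" are left out. Candidates with no recorded interactions do not appear in the network.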
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study and the alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study, and took the form of a 2 × 2 table cross-classifying genes as in or not in the Telonis-Scott et al. (2012) list and as in or not in the GWAS list, where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; and N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
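The gene list overlap test can be reproduced approximately with the counts given in the text and the 45 overlapping genes reported in the abstract. The sketch below fills the 2 × 2 table so that its four cells sum to N and computes a one-sided P value (overlap at least as large as observed) from the hypergeometric distribution; whether the original test was one- or two-sided is not stated, and the function names are hypothetical.

```python
from math import comb


def overlap_table(n_total, n_a, n_b, n_both):
    """2 x 2 table: rows = not in / in list A, columns = not in / in list B."""
    return [
        [n_total - n_a - n_b + n_both, n_b - n_both],
        [n_a - n_both, n_both],
    ]


def hypergeom_upper_tail(n_total, n_a, n_b, n_both):
    """P(overlap >= n_both) when n_b genes are drawn at random from n_total."""
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, min(n_a, n_b) + 1)
    )
    return tail / denom


N, A, B, BOTH = 17106, 416, 382, 45  # 45 = overlapping genes reported in the abstract
table = overlap_table(N, A, B, BOTH)
expected_overlap = A * B / N
p_overlap = hypergeom_upper_tail(N, A, B, BOTH)
```

With these counts only about nine shared genes would be expected by chance, so an overlap of 45 genes gives a very small P value.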
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolve alleles than array-based genotyping. Because of this limitation of the array data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization; Benjamini–Hochberg FDR correction, P < 0.05; Ontology = biological process; and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
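The resampling scheme can be sketched in plain Python on a toy interaction list. This is an illustration rather than the igraph-based analysis: the first-order networks are built as induced subgraphs (an assumption), the overlap is taken as the set of shared interactions, the empirical P value formula with a +1 correction is a common convention not specified in the text, and all names are hypothetical.

```python
import random


def first_order_edges(edges, seeds):
    """Interactions of the induced subgraph on seeds plus their direct interactors."""
    seeds = set(seeds)
    clean = {frozenset(e) for e in edges if e[0] != e[1]}  # no self-connections
    nodes = set()
    for e in clean:
        if e & seeds:
            nodes |= e
    return {e for e in clean if e <= nodes}


def overlap_size(edges, seeds_a, seeds_b):
    """(node count, edge count) of the overlap between two first-order networks."""
    shared = first_order_edges(edges, seeds_a) & first_order_edges(edges, seeds_b)
    # Nodes are taken from shared edges, so zero-degree nodes are excluded.
    nodes = set().union(*shared) if shared else set()
    return len(nodes), len(shared)


def simulate_overlap(edges, genes, size_a, size_b, n_iter=1000, seed=0):
    """Null distribution of overlap sizes for randomly resampled gene sets."""
    rng = random.Random(seed)
    return [
        overlap_size(edges, rng.sample(genes, size_a), rng.sample(genes, size_b))
        for _ in range(n_iter)
    ]


def empirical_p(observed, null_values):
    """Share of null values at least as large as the observed value."""
    return (1 + sum(v >= observed for v in null_values)) / (1 + len(null_values))


toy_ppi = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
null = simulate_overlap(toy_ppi, list("abcde"), size_a=2, size_b=2, n_iter=10)
```

Node and edge counts would each be compared with their own null distribution, for example `empirical_p(observed_nodes, [n for n, _ in null])`, where `observed_nodes` is the node count of the real overlap network.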
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship DE120102575 and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez Mariano J, Castillo-Martin M, Zheng T, Eastham James A, Gopalan A, Pienta Kenneth J, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O'Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
                                     Not in GWAS                    In GWAS
Not in Telonis-Scott et al. (2012)   N - A - B + Intersect(A,B)     B - Intersect(A,B)
In Telonis-Scott et al. (2012)       A - Intersect(A,B)             Intersect(A,B)
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgrò CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgrò CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgrò CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
set. Because the null P value distribution still did not fit the observed P value distribution particularly well (supplementary fig. S2, Supplementary Material online), these values were not used for a standard FDR correction. Instead, a P value threshold was calculated based on the lowest 0.05% of the null P values, similar in approach to Orozco-terWengel et al. (2012).
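The thresholding step can be illustrated with a minimal Python sketch. It assumes the cutoff is simply the lower 0.05% quantile of the simulated null P values; the function names `empirical_threshold` and `call_candidates` and the toy numbers are hypothetical and are not taken from the study's own scripts.

```python
def empirical_threshold(null_pvalues, fraction=0.0005):
    """Return the cutoff below which roughly `fraction` of the null P values fall.

    fraction=0.0005 corresponds to the lowest 0.05% of the null distribution.
    """
    if not null_pvalues:
        raise ValueError("need at least one null P value")
    ordered = sorted(null_pvalues)
    # Number of null values allowed at or below the cutoff (at least one);
    # the small epsilon guards against floating-point round-down.
    k = max(1, int(len(ordered) * fraction + 1e-9))
    return ordered[k - 1]


def call_candidates(observed_pvalues, threshold):
    """Indices of observed P values at or below the empirical threshold."""
    return [i for i, p in enumerate(observed_pvalues) if p <= threshold]


# Toy usage with a hypothetical, evenly spaced null distribution.
null_p = [i / 10000 for i in range(1, 10001)]
cutoff = empirical_threshold(null_p)
print(cutoff, call_candidates([0.0001, 0.2, 0.0004, 0.9], cutoff))
```

Because the cutoff comes from the null distribution itself, it does not rely on the null and observed distributions having the same shape, which is the stated reason for avoiding a standard FDR correction here.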
Inversion Analysis
In D. melanogaster, several cosmopolitan inversion polymorphisms show cross-continent latitudinal patterns that explain a substantial proportion of clinal variation for many traits, including thermal tolerance (reviewed in Hoffmann and Weeks 2007). Patterns of disequilibria generated between loci in the vicinity of the rearrangement can obscure signals of selection (Hoffmann and Weeks 2007), and inversion frequencies are routinely considered in genomic studies of natural populations sampled from known latitudinal clines (Fabian et al. 2012; Kapun et al. 2014). We assessed associations between inversion frequencies and our control and resistant pools at SNPs diagnostic for the following inversions: In(3R)Payne (Anderson et al. 2005; Kapun et al. 2014), In(2L)t (Andolfatto et al. 1999; Kapun et al. 2014), In(2R)Ns, In(3L)P, In(3R)C, In(3R)K, and In(3R)Mo (Kapun et al. 2014). Between one and five diagnostic SNPs were investigated for each inversion (depending on availability). We considered an inversion to be associated with desiccation resistance if the frequency of its diagnostic SNP differed significantly between the control and desiccation-resistant pools by a Fisher's exact test.
Genome Feature Annotation
The list of candidate SNPs was converted to vcf format using a custom R script, setting the alternate allele as the allele that increased in frequency from the control to the desiccation-resistant pool, and setting the reference allele as the other allele present in cases where both the major and minor alleles differed from the reference genome. The genomic features in which the candidate SNPs occurred were annotated using snpEff v4.0, first creating a database from the D. melanogaster v5.53 genome annotation with the default approach (Cingolani et al. 2012). The default annotation settings were used, except for setting the parameter "-ud 0" to annotate SNPs to genes only if they were within the gene body. Using χ² tests on SNP counts, we tested for over- or under-enrichment of feature types by comparing the proportion of each feature type among "all" SNPs detected in the study with its proportion among the "candidate" SNPs identified as differentiated in desiccation-resistant flies (Fabian et al. 2012).
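As an illustration of the enrichment test, the following Python sketch computes a Pearson χ² statistic (1 df) for one feature type. It rests on an assumption not stated in the text: candidate SNPs are contrasted with the remaining (non-candidate) SNPs so that the two rows of the table are independent. The helper names and the counts are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


def feature_enrichment(cand_in, cand_total, all_in, all_total):
    """Test whether a feature type is over- or under-represented among candidates.

    cand_in / cand_total: candidate SNPs inside the feature / all candidate SNPs.
    all_in / all_total:   all SNPs inside the feature / all SNPs in the study.
    """
    rest_in = all_in - cand_in            # non-candidate SNPs inside the feature
    rest_total = all_total - cand_total   # all non-candidate SNPs
    stat, p = chi2_2x2(cand_in, cand_total - cand_in,
                       rest_in, rest_total - rest_in)
    cand_share, all_share = cand_in / cand_total, all_in / all_total
    if cand_share > all_share:
        direction = "over"
    elif cand_share < all_share:
        direction = "under"
    else:
        direction = "none"
    return direction, stat, p


# Hypothetical counts for a single feature type (for example, introns).
print(feature_enrichment(cand_in=20, cand_total=30, all_in=30, all_total=60))
```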
GO Analysis
GO analysis was used to associate the candidates with their biological functions (Ashburner et al. 2000). Previously developed for biological pathway analysis of global gene expression studies (Wang et al. 2010), typical GO analysis assumes that all genes are sampled independently with equal probability. GWAS violate these criteria because SNPs are more likely to occur in longer genes than in shorter genes, resulting in higher type I error rates in GO categories overpopulated with long genes. Gene length as well as overlapping gene biases (Wang et al. 2011) were accounted for using Gowinda (Kofler and Schlötterer 2012), which implements a permutation strategy to reduce false positives. Gowinda does not reconstruct LD between SNPs but operates in two modes underpinned by two extreme assumptions: 1) gene mode (assumes all SNPs in a gene are in LD) and 2) SNP mode (assumes all SNPs in a gene are completely independent).
The candidates were examined in both modes, but given the stringency of gene mode, high-resolution SNP mode output was utilized for the final analysis with the following parameters: 10⁶ simulations, minimum significance = 1, and minimum genes per GO category = 3 (to exclude gene-poor categories). P values were corrected using the Gowinda FDR algorithm. GO annotations from FlyBase r5.57 were applied.
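The idea behind a SNP-mode permutation can be sketched in a few lines of Python. This is a simplified illustration of the general strategy (draw random SNP sets of the same size, map them to genes, and count genes per category), not Gowinda's implementation; the function names, the toy SNP-to-gene map, and the category are all hypothetical.

```python
import random


def genes_hit(snps, snp_to_genes):
    """Set of genes containing at least one of the given SNPs."""
    hit = set()
    for snp in snps:
        hit.update(snp_to_genes.get(snp, ()))
    return hit


def permutation_p(candidate_snps, all_snps, snp_to_genes, category_genes,
                  n_sim=1000, seed=0):
    """Empirical P value for the number of category genes hit by candidate SNPs.

    Random SNP sets of the same size are drawn from all SNPs, so long genes
    (which carry more SNPs) are hit more often under the null, as in real data.
    """
    rng = random.Random(seed)
    category = set(category_genes)
    observed = len(genes_hit(candidate_snps, snp_to_genes) & category)
    n_cand = len(candidate_snps)
    at_least = 0
    for _ in range(n_sim):
        draw = rng.sample(all_snps, n_cand)
        if len(genes_hit(draw, snp_to_genes) & category) >= observed:
            at_least += 1
    return observed, at_least / n_sim


# Toy data: one long gene with eight SNPs and two short genes with one SNP each.
all_snps = list(range(10))
snp_to_genes = {i: ["long"] for i in range(8)}
snp_to_genes.update({8: ["short1"], 9: ["short2"]})
print(permutation_p([8, 9], all_snps, snp_to_genes, {"short1", "short2"}, n_sim=2000))
```

Sampling SNPs rather than genes is what builds the gene-length correction into the null: a category full of long genes is also hit frequently by random SNP sets, so its empirical P value is not inflated.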
Candidate Gene Analysis
In conjunction with the GO analysis, we also examined known candidate genes associated with survival under low RH to better characterize possible biological functions of our GWAS candidates. We took a conservative approach to candidate gene assignment, where the SNPs were mapped within the entire gene region without up- and downstream extensions. Genes were curated from FlyBase and the literature (table 1), predominantly using detailed functional experimental evidence as criteria for functional desiccation candidates.
Network Analysis
We next obtained a higher level summary of the candidate gene list using PPI network analysis. The full candidate gene list was used to construct a first-order interaction network using all Drosophila PPIs listed in the Drosophila Interaction Database v. 2014_10 (avoiding interologs) (Murali et al. 2011). This "full" PPI network includes manually curated data from published literature and experimental data for 9,633 proteins and 93,799 interactions. For the network construction, the igraph R package was used to build a subgraph from the full PPI network containing the candidate proteins and their first-order interacting proteins (Csardi and Nepusz 2006). Self-connections were removed. Functional enrichment analysis was conducted on the list of network genes using FlyMine v40.0 (www.flymine.org, last accessed January 13, 2016) with the following parameters for GO enrichment and KEGG/Reactome pathway enrichment: Benjamini–Hochberg FDR correction, P < 0.05; background = all D. melanogaster genes in our full PPI network; and Ontology (for GO enrichment only) = biological process or molecular function.
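The network construction can be sketched in plain Python without igraph. The sketch assumes that the first-order network is the induced subgraph on the seed proteins plus their direct interactors (so interactions among the interactors are retained); the text does not specify this detail. The function name and the toy edge list are hypothetical.

```python
def first_order_network(edges, seeds):
    """Return (nodes, edges) of the first-order network around the seed proteins.

    edges: iterable of (protein_a, protein_b) pairs from the full PPI network.
    seeds: candidate proteins used to seed the network.
    """
    seeds = set(seeds)
    # Drop self-connections and treat interactions as undirected and unique.
    clean = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    # Seeds that have at least one interaction, plus their direct interactors.
    nodes = set()
    for edge in clean:
        if edge & seeds:
            nodes |= edge
    # Induced subgraph: keep every interaction whose two ends are both retained.
    sub_edges = {edge for edge in clean if edge <= nodes}
    return nodes, sub_edges


# Toy PPI network; "a" is the only seed and has a self-connection to discard.
ppi = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("a", "a")]
nodes, net = first_order_network(ppi, ["a"])
print(sorted(nodes), sorted(tuple(sorted(edge)) for edge in net))
```

Seeds with no interactions in the full network do not appear in the result, which is one way a gene list can map to fewer network proteins than genes.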
Cross-Study Comparison: Overlap with Our Previous Work
Following on from the comparison between our gene list and the candidate genes suggested from the literature, we quantitatively evaluated the degree of overlap between the candidate genes identified in this study and the alleles mapped in our previous study of flies collected from the same geographic region (Telonis-Scott et al. 2012). The "overlap" analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the gene list overlap was tested between the two studies using Fisher's exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study and took the following form:
where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
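A worked version of this test can be sketched in Python using the list sizes given above and the 45-gene overlap reported in the abstract. The sketch assumes a one-sided (over-representation) Fisher's exact test, which the text does not specify, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def overlap_table(n_total, n_a, n_b, n_both):
    """2x2 table: rows = not in A / in A, columns = not in B / in B."""
    return ((n_total - n_a - n_b + n_both, n_b - n_both),
            (n_a - n_both, n_both))


def fisher_upper_tail(n_total, n_a, n_b, n_both):
    """One-sided Fisher's exact P value: P(overlap >= n_both).

    Hypergeometric model: n_b genes are drawn at random from n_total genes,
    of which n_a belong to list A.
    """
    denom = comb(n_total, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, min(n_a, n_b) + 1))
    return float(Fraction(tail, denom))


# A = Telonis-Scott et al. (2012), B = this study, N = annotated genes in r5.53.
N, A, B, BOTH = 17106, 416, 382, 45
print(overlap_table(N, A, B, BOTH))
print("expected overlap by chance:", A * B / N)
print("one-sided P:", fisher_upper_tail(N, A, B, BOTH))
```

Under random sampling the two lists would be expected to share about nine genes, so an overlap of 45 lies far in the upper tail of the hypergeometric distribution.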
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which better resolves alleles than array-based genotyping. Because of the lower resolution of the array data, we did not compare exact genomic coordinates between the studies, but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which was sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization, Benjamini–Hochberg FDR correction, P < 0.05, Ontology = biological process, and default background (all GO-annotated D. melanogaster genes).
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
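The simulation can be sketched in Python as follows. Two choices here are assumptions rather than details from the text: first-order networks are taken as induced subgraphs on the seeds plus their direct interactors, and the empirical P value is the fraction of iterations whose overlap is at least as large as the observed one. All function names and the toy network are hypothetical; the study itself used igraph in R.

```python
import random


def first_order_edges(edges, seeds):
    """Edges of the induced subgraph on the seeds plus their direct interactors."""
    seeds = set(seeds)
    clean = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    nodes = set().union(*(edge for edge in clean if edge & seeds))
    return {edge for edge in clean if edge <= nodes}


def overlap_size(edges, set_a, set_b):
    """Node and edge counts of the network shared by two first-order networks."""
    shared = first_order_edges(edges, set_a) & first_order_edges(edges, set_b)
    # Only nodes touching a shared edge remain, so zero-degree nodes drop out.
    nodes = set().union(*shared)
    return len(nodes), len(shared)


def simulate_null(edges, gene_pool, size_a, size_b, n_iter=1000, seed=0):
    """Null distribution of (nodes, edges) overlap from randomly resampled gene sets."""
    rng = random.Random(seed)
    return [overlap_size(edges,
                         rng.sample(gene_pool, size_a),
                         rng.sample(gene_pool, size_b))
            for _ in range(n_iter)]


def empirical_p(observed, null_values):
    """Fraction of null values at least as large as the observed value."""
    return sum(value >= observed for value in null_values) / len(null_values)


# Toy example on a six-protein network.
ppi = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]
obs_nodes, obs_edges = overlap_size(ppi, ["b"], ["c"])
null = simulate_null(ppi, list("abcdxy"), 1, 1, n_iter=200)
print(obs_nodes, empirical_p(obs_nodes, [n for n, _ in null]))
print(obs_edges, empirical_p(obs_edges, [e for _, e in null]))
```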
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship (DE120102575) and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol 14:851–858.
Kofler R Schleurootterer C 2012 Gowinda unbiased analysis of gene setenrichment for genome-wide association studies Bioinformatics282084ndash2085
Leiserson MDM Eldridge JV Ramachandran S Raphael BJ 2013Network analysis of GWAS data Curr Opin Genet Dev 23602ndash610
Li H Durbin R 2009 Fast and accurate short read alignment withBurrows-Wheeler transform Bioinformatics 251754ndash1760
Li H Handsaker B Wysoker A Fennell T Ruan J Homer N Marth GAbecasis G Durbin R 2009 The Sequence AlignmentMap formatand SAMtools Bioinformatics 252078ndash2079
14
Liu Y Luo J Carlsson MA Neuroassel DR 2015 Serotonin and insulin-likepeptides modulate leucokinin-producing neurons that affectfeeding and water homeostasis in Drosophila J Comp Neurol5231840ndash1863
Lohse M Bolger AM Nagel A Fernie AR Lunn JE Stitt M Usadel B 2012RobiNA a user-friendly integrated software solution for RNA-Seq-based transcriptomics Nucleic Acids Res 40W622ndashW627
Long A Liti G Luptak A Tenaillon O 2015 Elucidating the moleculararchitecture of adaptation via evolve and resequence experimentsNat Rev Genet 16567ndash582
Lynch M Bost D Wilson S Maruki T Harrison S 2014 Population-genetic inference from pooled-sequencing data Genome Biol Evol61210ndash1218
Mackay TFC Stone EA Ayroles JF 2009 The genetics of quantitativetraits challenges and prospects Nat Rev Genet 10565ndash577
Marron MT Markow TA Kain KJ Gibbs AG 2003 Effects of starvationand desiccation on energy metabolism in desert and mesicDrosophila J Insect Physiol 49261ndash270
Matzkin LM Markow TA 2009 Transcriptional regulation of metabo-lism associated with the increased desiccation resistance of thecactophilic Drosophila mojavensis Genetics 1821279ndash1288
Murali T Pacifico S Yu J Guest S Roberts GG Finley RL Jr 2011 DroID2011 a comprehensive integrated resource for protein transcrip-tion factor RNA and gene interactions for Drosophila Nucleic AcidsRes 39D736ndashD743
Nuzhdin SV Turner TL 2013 Promises and limitations of hitchhikingmapping Curr Opin Genet Dev 23694ndash699
Orozco-terWengel P Kapun M Nolte V Kofler R Flatt T Schleurootterer C2012 Adaptation of Drosophila to a novel laboratory environmentreveals temporally heterogeneous trajectories of selected alleles MolEcol 214931ndash4941
Picard Picard A set of Java command line tools for manipulating high-throughput seqeuncing (HTS) data and formats Available fromhttpbroadinstitutegithubiopicard
Qiu Y Tittiger C Wicker-Thomas C Le Goff G Young S Wajnberg EFricaux T Taquet N Blomquist GJ Feyereisen R 2012 An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon bio-synthesis Proc Natl Acad Sci U S A 10914858ndash14863
Rajpurohit S Oliveira CC Etges WJ Gibbs AG 2013 Functional genomicand phenotypic responses to desiccation in natural populations of adesert Drosophilid Mol Ecol 222698ndash2715
Sarup P Soslashrensen JG Kristensen TN Hoffmann AA Loeschcke V PaigeKN Soslashrensen P 2011 Candidate genes detected in transcriptomestudies are strongly dependent on genetic background PLoS One6e15644
Schiffer M Hangartner S Hoffmann AA 2013 Assessing the relativeimportance of environmental effects carry-over effects and speciesdifferences in thermal stress resistance a comparison ofDrosophilids across field and laboratory generations J Exp Biol2163790ndash3798
Schleurootterer C Kofler R Versace E Tobler R Franssen SU 2015Combining experimental evolution with next-generation sequen-cing A powerful tool to study adaptation from standing geneticvariation Heredity 114431ndash440
Sinclair BJ Gibbs AG Roberts SP 2007 Gene transcription during ex-posure to and recovery from cold and desiccation stress inDrosophila melanogaster Insect Mol Biol 16435ndash443
Sloggett C Wakefield M Philip G Pope B 2014 Rubramdashflexible distrib-uted pipelines figshare Available from httpdxdoiorg106084m9figshare895626
Smit AFA Hubley R Green P 2013-2015 RepeatMasker Open-403Available from httpwwwrepeatmaskerorg
Seurooderberg JA Birse RT Neuroassel DR 2011 Insulin production and signalingin renal tubules of Drosophila is under control of tachykinin-relatedpeptide and regulates stress resistance PLoS One 6e19866
Soslashrensen JG Nielsen MM Loeschcke V 2007 Gene expression profileanalysis of Drosophila melanogaster selected for resistance to envi-ronmental stressors J Evol Biol 201624ndash1636
St Pierre SE Ponting L Stefancsik R McQuilton P 2014 FlyBase102mdashadvanced approaches to interrogating FlyBase Nucleic AcidsRes 42D780ndashD788
Stevenson M Nunes T Heuser C Marshall J Sanchez J Thornton RReiczigel J Robison-Cox J Sebastiani P Solymos P et al 2015 epiRTools for the analysis of epidemiological data R package version09-62 Available from httpCRANR-projectorgpackagefrac14epiR
Storey JD Tibshirani R 2003 Statistical significance for genomewidestudies Proc Natl Acad Sci U S A 1009440ndash9445
Telonis-Scott M Gane M DeGaris S Sgro CM Hoffmann AA 2012 Highresolution mapping of candidate alleles for desiccation resistancein Drosophila melanogaster under selection Mol Biol Evol291335ndash1351
Telonis-Scott M Guthridge KM Hoffmann AA 2006 A new set oflaboratory-selected Drosophila melanogaster lines for the analysisof desiccation resistance response to selection physiology and cor-related responses J Exp Biol 2091837ndash1847
Terhzaz S Cabrero P Robben JH Radford JC Hudson BD Milligan GDow JA Davies SA 2012 Mechanism and function of Drosophilacapa GPCR a desiccation stress-responsive receptor with functionalhomology to human neuromedinU receptor PLoS One 7e29897
Terhzaz S Overend G Sebastian S Dow JA Davies SA 2014 The Dmelanogaster capa-1 neuropeptide activates renal NF-kB signalingPeptides 53218ndash224
Terhzaz S Teets NM Cabrero P Henderson L Ritchie MG Nachman RJDow JA Denlinger DL Davies SA 2015 Insect capa neuropeptidesimpact desiccation and cold tolerance Proc Natl Acad Sci U S A1122882ndash2887
Thorat LJ Gaikwad SM Nath BB 2012 Trehalose as an indicator ofdesiccation stress in Drosophila melanogaster larvae a potentialmarker of anhydrobiosis Biochem Biophys Res Commun419638ndash642
Tobler R Franssen SU Kofler R Orozco-terWengel P Nolte VHermisson J Schl eurootterer C 2014 Massive habitat-specific ge-nomic response in D melanogaster populations during exper-imental evolution in hot and cold environments Mol Biol Evol31364ndash375
Turner TL Bourne EC Von Wettberg EJ Hu TT Nuzhdin SV 2010Population resequencing reveals local adaptation of Arabidopsislyrata to serpentine soils Nat Genet 42260ndash263
Turner TL Miller PM 2012 Investigating natural variation in Drosophilacourtship song by the evolve and resequence approach Genetics191633ndash642
Vermeulen CJ Soslashrensen P Kirilova Gagalova K Loeschcke V 2013Transcriptomic analysis of inbreeding depression in cold-sensitiveDrosophila melanogaster shows upregulation of the immune re-sponse J Evol Biol 261890ndash1902
Wang B Goode J Best J Meltzer J Schilman PE Chen J Garza D ThomasJB Montminy M 2008 The insulin-regulated CREB coactivatorTORC promotes stress resistance in Drosophila Cell Metab7434ndash444
Wang J Kean L Yang J Allan AK Davies SA Herzyk P Dow JA 2004Function-informed transcriptome analysis of Drosophila renaltubule Genome Biol 5R69
Wang K Li M Hakonarson H 2010 Analysing biological pathways ingenome-wide association studies Nat Rev Genet 11843ndash854
Wang L Jia P Wolfinger RD Chen X Zhao Z 2011 Gene set analysis ofgenome-wide association studies methodological issues and per-spectives Genomics 981ndash8
Weber AL Khan GF Magwire MM Tabor CL Mackay TF Anholt RR2012 Genome-wide association analysis of oxidative stress resistancein Drosophila melanogaster PLoS One 7e34745
Weeks AR McKechnie SW Hoffmann AA 2002 Dissecting adaptiveclinal variation markers inversions and sizestress associations inDrosophila melanogaster from a central field population Ecol Lett5756ndash763
Zhu Y Bergland AO Gonzalez J Petrov DA 2012 Empirical validation ofpooled whole genome population re-sequencing in Drosophila mel-anogaster PLoS One 7e41901
15
region (Telonis-Scott et al. 2012). The “overlap” analysis was performed similarly for the GWAS candidate gene and network-level approaches. First, the overlap between the gene lists of the two studies was tested using Fisher’s exact test, where the 382 genes identified from the candidate SNPs (this study) were compared with the 416 genes identified from candidate single feature polymorphisms (Telonis-Scott et al. 2012). The contingency table was constructed using the number of annotated genes in the D. melanogaster r5.53 genome build as an approximation for the total number of genes tested in each study. It took the form of a 2 × 2 table classifying each gene by its presence or absence in the two lists, where A = gene list from Telonis-Scott et al. (2012), n = 416; B = gene list from this study, n = 382; and N = total number of genes in the D. melanogaster reference genome r5.53, n = 17,106.
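As an illustration of this test, the following Python sketch builds the 2 × 2 table from the list sizes given above and computes a one-sided (enrichment) Fisher's exact P value from the hypergeometric distribution. The overlap of 45 genes is the core set reported in the Abstract; the function names and the choice of a one-sided alternative are assumptions made for this sketch, not details of the original analysis.

```python
import math


def contingency_table(n_a, n_b, n_overlap, n_total):
    """Return [[neither, b_only], [a_only, both]] for two gene lists.

    Rows: absent/present in list A; columns: absent/present in list B.
    """
    a_only = n_a - n_overlap
    b_only = n_b - n_overlap
    neither = n_total - n_a - n_b + n_overlap
    return [[neither, b_only], [a_only, n_overlap]]


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_one_sided(n_a, n_b, n_overlap, n_total):
    """P(overlap >= n_overlap) when list B is drawn at random from n_total genes."""
    log_denom = log_comb(n_total, n_b)
    upper = min(n_a, n_b)
    total = 0.0
    for k in range(n_overlap, upper + 1):
        log_p = log_comb(n_a, k) + log_comb(n_total - n_a, n_b - k) - log_denom
        total += math.exp(log_p)
    return min(1.0, total)


# List sizes from the text; the overlap of 45 is taken from the Abstract.
A, B, N, OVERLAP = 416, 382, 17106, 45
table = contingency_table(A, B, OVERLAP, N)
expected = A * B / N  # expected overlap if the two lists were independent
p_value = fisher_one_sided(A, B, OVERLAP, N)
print(table, round(expected, 2), p_value)
```

Under independence the expected overlap for these list sizes is A × B / N, about 9.3 genes, which is the baseline against which the observed overlap is judged.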
To annotate genome features in the overlap candidate gene list, we used the SNP data (this study), which resolve alleles better than array-based genotyping. Because of this limitation of the array data, we did not compare exact genomic coordinates between the studies but considered overlap to occur when differentiated alleles were detected in the same gene in both studies. We did, however, examine the allele frequencies around the Dys-RF promoter, which had been sequenced separately, revealing a selective sweep in the experimental evolution lines (Telonis-Scott et al. 2012).
The overlapping candidates were tested for functional enrichment using a basic gene list approach. As genes rather than SNPs were analyzed here, FlyMine v40.0 (www.flymine.org) was applied with the following parameters: gene length normalization; Benjamini–Hochberg FDR correction, P < 0.05; ontology = biological process; and default background (all GO-annotated D. melanogaster genes).
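The Benjamini–Hochberg adjustment applied to the enrichment P values can be sketched as follows. This is a minimal Python illustration of the standard step-up procedure, not FlyMine's implementation; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment P values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adjusted]  # FDR threshold used in the text
print(adjusted, significant)
```

A term is retained when its adjusted P value falls below the 0.05 threshold stated above.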
Finally, we constructed a second PPI network from the candidate genes mapped in Telonis-Scott et al. (2012) to examine system-level overlap between the two studies. The 416 genes mapped to 337 protein seeds, and the network was constructed as described above. The degree of network overlap between the two studies was tested using simulation. In each iteration, a gene set of the same length as the list from this study and a gene set of the same length as the list from Telonis-Scott et al. (2012) were resampled randomly from the entire D. melanogaster mapped gene annotation (r5.53). Each resampled set was then used to build a first-order network as previously described. The overlap network between the two resampled sets was calculated using the graph.intersection command from the igraph package, and zero-degree nodes (proteins with no interactions) were removed. Overlap was quantified by counting the number of nodes and the number of edges in this overlap network. The observed overlap was compared with a simulated null distribution for both node and edge number, where n = 1,000 simulation iterations.
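The resampling scheme can be sketched in Python as below. Several details are assumptions made for illustration only: a first-order network is taken to be every interaction with at least one seed endpoint, the interaction list is a toy ring of 30 hypothetical genes rather than the PPI data used in the study, and the empirical P value uses a one-sided count with a +1 correction. The original analysis used the igraph package in R; plain Python sets stand in for it here.

```python
import random


def first_order_network(seeds, interactions):
    """Edges with at least one endpoint in the seed set (assumed definition)."""
    seeds = set(seeds)
    return {edge for edge in interactions if edge & seeds}


def overlap_size(net_a, net_b):
    """Count nodes and edges shared by two networks.

    Only nodes that keep at least one shared edge are counted, which mirrors
    removing zero-degree nodes from the intersection graph.
    """
    shared_edges = net_a & net_b
    shared_nodes = set().union(*shared_edges)
    return len(shared_nodes), len(shared_edges)


def simulate_null(genes, interactions, size_a, size_b, n_iter=1000, seed=0):
    """Null distribution of (node, edge) overlap for random gene sets."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_iter):
        set_a = rng.sample(genes, size_a)
        set_b = rng.sample(genes, size_b)
        net_a = first_order_network(set_a, interactions)
        net_b = first_order_network(set_b, interactions)
        null.append(overlap_size(net_a, net_b))
    return null


def empirical_p(observed, null_values):
    """One-sided empirical P value with a +1 correction (an assumption)."""
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Toy data: 30 hypothetical genes joined in a ring of pairwise interactions.
genes = [f"g{i}" for i in range(30)]
interactions = {frozenset((genes[i], genes[(i + 1) % 30])) for i in range(30)}

null = simulate_null(genes, interactions, size_a=5, size_b=4, n_iter=1000)
observed_nodes, observed_edges = 6, 5  # hypothetical observed overlap
p_nodes = empirical_p(observed_nodes, [n for n, _ in null])
p_edges = empirical_p(observed_edges, [e for _, e in null])
print(p_nodes, p_edges)
```

Node and edge counts are tested separately, matching the two null distributions described above.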
Supplementary Material
Supplementary figures S1–S4 and tables S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Robert Good for the fly sampling, and Jennifer Shirriffs and Lea and Alan Rako for outstanding assistance with laboratory culture and substantial desiccation assays. We are grateful to Melissa Davis for advice on network building and comparison approaches, and to five anonymous reviewers whose comments improved the manuscript. This work was supported by a DECRA fellowship DE120102575 and Monash University Technology voucher scheme to M.T.S., an ARC Future Fellowship and Discovery Grants to C.M.S., an ARC Laureate Fellowship to A.A.H., and a Science and Industry Endowment Fund grant to C.M.S. and A.A.H.
References
Alexander JM, Lomvardas S. 2014. Nuclear architecture as an epigenetic regulator of neural development and function. Neuroscience 264:39–50.
Anderson AR, Hoffmann AA, McKechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian Drosophila melanogaster populations. Mol Ecol. 14:851–858.
Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–1311.
Andrews S. 2014. FastQC. Available from: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 25:25–29.
Aytes A, Mitrofanova A, Lefebvre C, Alvarez MJ, Castillo-Martin M, Zheng T, Eastham JA, Gopalan A, Pienta KJ, Shen MM, et al. 2014. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 25:638–651.
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. 2013. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet. 9:e1003534.
Bergland AO, Behrman EL, O’Brien KR, Schmidt PS, Petrov DA. 2014. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 10:e1004775.
Bonebrake TC, Mastrandrea MD. 2010. Tolerance adaptation and precipitation changes complicate latitudinal patterns of climate change impacts. Proc Natl Acad Sci U S A. 107:12581–12586.
Bradley TJ. 2009. Animal osmoregulation. Oxford: Oxford University Press.
Burke MK, Liti G, Long AD. 2014. Standing genetic variation drives repeatable experimental evolution in outcrossing populations of Saccharomyces cerevisiae. Mol Biol Evol. 31:3228–3239.
Chown SL. 2012. Trait-based approaches to conservation physiology: forecasting environmental change risks from the bottom up. Philos Trans R Soc B. 367:1615–1627.
Chown SL, Nicolson SW. 2004. Insect physiological ecology: mechanisms and patterns. Oxford: Oxford University Press.
Chown SL, Sørensen JG, Terblanche JS. 2011. Water loss in insects: an environmental change perspective. J Insect Physiol. 57:1070–1084.
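Contingency table for Fisher’s exact test of gene list overlap (A = number of genes in the Telonis-Scott et al. (2012) list, B = number of genes in the GWAS list from this study, N = total number of annotated genes, Intersect(A,B) = number of genes in both lists):

                                      Not in GWAS                   In GWAS
Not in Telonis-Scott et al. (2012)    N − A − B + Intersect(A,B)    B − Intersect(A,B)
In Telonis-Scott et al. (2012)        A − Intersect(A,B)            Intersect(A,B)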
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet. 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat. 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org.
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol. 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol. 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response ‘cross-talk’ in the Drosophila Malpighian tubule. J Insect Physiol. 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J. 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet. 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool. 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun. 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol. 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol. 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci. 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet. 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol. 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol. 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc. 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol. 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett. 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A. 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered. 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol. 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem. 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour. 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev. 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol. 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet. 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet. 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol. 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res. 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev. 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 21:4931–4941.
Picard. A set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard.
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A. 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol. 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol. 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol. 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra: flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626.
Smit AFA, Hubley R, Green P. 2013–2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org.
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol. 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102: advanced approaches to interrogating FlyBase. Nucleic Acids Res. 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR.
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol. 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol. 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-kB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A. 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun. 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol. 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet. 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol. 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab. 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol. 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett. 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.
15
Chung H, Loehlin DW, Dufour HD, Vaccarro K, Millar JG, Carroll SB. 2014. A single gene affects both ecological divergence and mate choice in Drosophila. Science 343:1148–1151.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6:80–92.
Civelek M, Lusis AJ. 2014. Systems genetics approaches to understand complex traits. Nat Rev Genet 15:34–48.
Clusella-Trullas S, Blackburn TM, Chown SL. 2011. Climatic predictors of temperature performance curve parameters in ectotherms imply complex responses to climate change. Am Nat 177:738–751.
Csardi G, Nepusz T. 2006. The igraph software package for complex network research. InterJournal Complex Systems 1695. Available from: http://igraph.org.
Davies SA, Cabrero P, Overend G, Aitchison L, Sebastian S, Terhzaz S, Dow JA. 2014. Cell signalling mechanisms for insect stress tolerance. J Exp Biol 217:119–128.
Davies SA, Cabrero P, Povsic M, Johnston NR, Terhzaz S, Dow JA. 2013. Signaling by Drosophila capa neuropeptides. Gen Comp Endocrinol 188:60–66.
Davies SA, Overend G, Sebastian S, Cundall M, Cabrero P, Dow JA, Terhzaz S. 2012. Immune and stress response 'cross-talk' in the Drosophila Malpighian tubule. J Insect Physiol 58:488–497.
Day JP, Dow JA, Houslay MD, Davies SA. 2005. Cyclic nucleotide phosphodiesterases in Drosophila melanogaster. Biochem J 388:333–342.
de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in response to stress. Nat Rev Genet 12:833–845.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43:491–498.
Djawdan M, Chippindale AK, Rose MR, Bradley TJ. 1998. Metabolic reserves and evolved stress resistance in Drosophila melanogaster. Physiol Zool 71:584–594.
Durham MF, Magwire MM, Stone EA, Leips J. 2014. Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nat Commun 5:4338.
Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, Carlson S, Helgason A, Walters GB, Gunnarsdottir S, et al. 2008. Genetics of gene expression and its effect on disease. Nature 452:423–428.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide patterns of latitudinal differentiation among populations of Drosophila melanogaster from North America. Mol Ecol 21:4748–4769.
Foley BR, Telonis-Scott M. 2011. Quantitative genetic analysis suggests causal association between cuticular hydrocarbon composition and desiccation survival in Drosophila melanogaster. Heredity 106:68–77.
Folk DG, Han C, Bradley TJ. 2001. Water acquisition and partitioning in Drosophila melanogaster: effects of selection for desiccation-resistance. J Exp Biol 204:3323–3331.
Fowler MA, Montell C. 2013. Drosophila TRP channels and animal behavior. Life Sci 92:394–403.
Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, Ward JL, Beale MH, de Vos RCH, Dijkstra M, Scheltema RA, et al. 2009. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet 41:166–167.
Futschik A, Schlötterer C. 2010. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics 186:207–218.
Gibbs AG, Fukuzato F, Matzkin LM. 2003. Evolution of water conservation mechanisms in Drosophila. J Exp Biol 206:1183–1192.
Gibbs AG, Matzkin LM. 2001. Evolution of water balance in the genus Drosophila. J Exp Biol 204:2331–2338.
Goodstadt L. 2010. Ruffus: a lightweight Python library for computational pipelines. Bioinformatics 26:2778–2779.
Hadley NF. 1994. Water relations of terrestrial arthropods. San Diego (CA): Academic Press.
Hoffmann AA, Hallas R, Sinclair C, Mitrovski P. 2001. Levels of variation in stress resistance in Drosophila among strains, local populations, and geographic regions: patterns for desiccation, starvation, cold resistance, and associated traits. Evolution 55:1621–1630.
Hoffmann AA, Parsons PA. 1989a. An integrated approach to environmental stress tolerance and life-history variation: desiccation tolerance in Drosophila. Biol J Linn Soc 37:117–136.
Hoffmann AA, Parsons PA. 1989b. Selection for increased desiccation resistance in Drosophila melanogaster: additive genetic control and correlated responses for other stresses. Genetics 122:837–845.
Hoffmann AA, Sgro CM. 2011. Climate change and evolutionary adaptation. Nature 470:479–485.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100 year-old invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from eastern Australia. Genetica 129:133–147.
Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, et al. 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A 109:15553–15559.
Huang Z, Tunnacliffe A. 2004. Response of human cells to desiccation: comparison with hyperosmotic stress response. J Physiol 558:181–191.
Huang Z, Tunnacliffe A. 2005. Gene induction by desiccation stress in human cell cultures. FEBS Lett 579:4973–4977.
Ichihashi Y, Aguilar-Martínez JA, Farhi M, Chitwood DH, Kumar R, Millon LV, Peng J, Maloof JN, Sinha NR. 2014. Evolutionary developmental transcriptomics reveals a gene network module regulating interspecific diversity in plant leaf shape. Proc Natl Acad Sci U S A 111:E2616–E2621.
Itoh M, Nanba N, Hasegawa M, Inomata N, Kondo R, Oshima M, Takano-Shimizu T. 2010. Seasonal changes in the long-distance linkage disequilibrium in Drosophila melanogaster. J Hered 101:26–32.
Johnson WA, Carder JW. 2012. Drosophila nociceptors mediate larval aversion to dry surface environments utilizing both the painless TRP channel and the DEG/ENaC subunit, PPK1. PLoS One 7:e32878.
Kahsai L, Kapan N, Dircksen H, Winther AM, Nässel DR. 2010. Metabolic stress responses in Drosophila are modulated by brain neurosecretory cells that produce multiple neuropeptides. PLoS One 5:e11480.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster. Mol Ecol 23:1813–1827.
Kawano T, Shimoda M, Matsumoto H, Ryuda M, Tsuzuki S, Hayakawa Y. 2010. Identification of a gene, Desiccate, contributing to desiccation resistance in Drosophila melanogaster. J Biol Chem 285:38889–38897.
Kellermann V, van Heerwaarden B, Sgro CM, Hoffmann AA. 2009. Fundamental evolutionary limits in ecological traits drive Drosophila species distributions. Science 325:1244–1246.
Kofler R, Nolte V, Schlötterer C. 2016. The impact of library preparation protocols on the consistency of allele frequency estimates in Pool-Seq data. Mol Ecol Resour 16:118–122.
Kofler R, Pandey RV, Schlötterer C. 2011. PoPoolation2: identifying differentiation between populations using sequencing of pooled DNA samples (Pool-Seq). Bioinformatics 27:3435–3436.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies. Bioinformatics 28:2084–2085.
Leiserson MDM, Eldridge JV, Ramachandran S, Raphael BJ. 2013. Network analysis of GWAS data. Curr Opin Genet Dev 23:602–610.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Liu Y, Luo J, Carlsson MA, Nässel DR. 2015. Serotonin and insulin-like peptides modulate leucokinin-producing neurons that affect feeding and water homeostasis in Drosophila. J Comp Neurol 523:1840–1863.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res 40:W622–W627.
Long A, Liti G, Luptak A, Tenaillon O. 2015. Elucidating the molecular architecture of adaptation via evolve and resequence experiments. Nat Rev Genet 16:567–582.
Lynch M, Bost D, Wilson S, Maruki T, Harrison S. 2014. Population-genetic inference from pooled-sequencing data. Genome Biol Evol 6:1210–1218.
Mackay TFC, Stone EA, Ayroles JF. 2009. The genetics of quantitative traits: challenges and prospects. Nat Rev Genet 10:565–577.
Marron MT, Markow TA, Kain KJ, Gibbs AG. 2003. Effects of starvation and desiccation on energy metabolism in desert and mesic Drosophila. J Insect Physiol 49:261–270.
Matzkin LM, Markow TA. 2009. Transcriptional regulation of metabolism associated with the increased desiccation resistance of the cactophilic Drosophila mojavensis. Genetics 182:1279–1288.
Murali T, Pacifico S, Yu J, Guest S, Roberts GG, Finley RL Jr. 2011. DroID 2011: a comprehensive, integrated resource for protein, transcription factor, RNA and gene interactions for Drosophila. Nucleic Acids Res 39:D736–D743.
Nuzhdin SV, Turner TL. 2013. Promises and limitations of hitchhiking mapping. Curr Opin Genet Dev 23:694–699.
Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlötterer C. 2012. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol 21:4931–4941.
Picard. Picard: a set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats. Available from: http://broadinstitute.github.io/picard.
Qiu Y, Tittiger C, Wicker-Thomas C, Le Goff G, Young S, Wajnberg E, Fricaux T, Taquet N, Blomquist GJ, Feyereisen R. 2012. An insect-specific P450 oxidative decarbonylase for cuticular hydrocarbon biosynthesis. Proc Natl Acad Sci U S A 109:14858–14863.
Rajpurohit S, Oliveira CC, Etges WJ, Gibbs AG. 2013. Functional genomic and phenotypic responses to desiccation in natural populations of a desert Drosophilid. Mol Ecol 22:2698–2715.
Sarup P, Sørensen JG, Kristensen TN, Hoffmann AA, Loeschcke V, Paige KN, Sørensen P. 2011. Candidate genes detected in transcriptome studies are strongly dependent on genetic background. PLoS One 6:e15644.
Schiffer M, Hangartner S, Hoffmann AA. 2013. Assessing the relative importance of environmental effects, carry-over effects and species differences in thermal stress resistance: a comparison of Drosophilids across field and laboratory generations. J Exp Biol 216:3790–3798.
Schlötterer C, Kofler R, Versace E, Tobler R, Franssen SU. 2015. Combining experimental evolution with next-generation sequencing: a powerful tool to study adaptation from standing genetic variation. Heredity 114:431–440.
Sinclair BJ, Gibbs AG, Roberts SP. 2007. Gene transcription during exposure to, and recovery from, cold and desiccation stress in Drosophila melanogaster. Insect Mol Biol 16:435–443.
Sloggett C, Wakefield M, Philip G, Pope B. 2014. Rubra - flexible distributed pipelines. figshare. Available from: http://dx.doi.org/10.6084/m9.figshare.895626.
Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker Open-4.0.3. Available from: http://www.repeatmasker.org.
Söderberg JA, Birse RT, Nässel DR. 2011. Insulin production and signaling in renal tubules of Drosophila is under control of tachykinin-related peptide and regulates stress resistance. PLoS One 6:e19866.
Sørensen JG, Nielsen MM, Loeschcke V. 2007. Gene expression profile analysis of Drosophila melanogaster selected for resistance to environmental stressors. J Evol Biol 20:1624–1636.
St Pierre SE, Ponting L, Stefancsik R, McQuilton P. 2014. FlyBase 102 - advanced approaches to interrogating FlyBase. Nucleic Acids Res 42:D780–D788.
Stevenson M, Nunes T, Heuser C, Marshall J, Sanchez J, Thornton R, Reiczigel J, Robison-Cox J, Sebastiani P, Solymos P, et al. 2015. epiR: Tools for the analysis of epidemiological data. R package version 0.9-62. Available from: http://CRAN.R-project.org/package=epiR.
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100:9440–9445.
Telonis-Scott M, Gane M, DeGaris S, Sgro CM, Hoffmann AA. 2012. High resolution mapping of candidate alleles for desiccation resistance in Drosophila melanogaster under selection. Mol Biol Evol 29:1335–1351.
Telonis-Scott M, Guthridge KM, Hoffmann AA. 2006. A new set of laboratory-selected Drosophila melanogaster lines for the analysis of desiccation resistance: response to selection, physiology and correlated responses. J Exp Biol 209:1837–1847.
Terhzaz S, Cabrero P, Robben JH, Radford JC, Hudson BD, Milligan G, Dow JA, Davies SA. 2012. Mechanism and function of Drosophila capa GPCR: a desiccation stress-responsive receptor with functional homology to human neuromedinU receptor. PLoS One 7:e29897.
Terhzaz S, Overend G, Sebastian S, Dow JA, Davies SA. 2014. The D. melanogaster capa-1 neuropeptide activates renal NF-κB signaling. Peptides 53:218–224.
Terhzaz S, Teets NM, Cabrero P, Henderson L, Ritchie MG, Nachman RJ, Dow JA, Denlinger DL, Davies SA. 2015. Insect capa neuropeptides impact desiccation and cold tolerance. Proc Natl Acad Sci U S A 112:2882–2887.
Thorat LJ, Gaikwad SM, Nath BB. 2012. Trehalose as an indicator of desiccation stress in Drosophila melanogaster larvae: a potential marker of anhydrobiosis. Biochem Biophys Res Commun 419:638–642.
Tobler R, Franssen SU, Kofler R, Orozco-terWengel P, Nolte V, Hermisson J, Schlötterer C. 2014. Massive habitat-specific genomic response in D. melanogaster populations during experimental evolution in hot and cold environments. Mol Biol Evol 31:364–375.
Turner TL, Bourne EC, Von Wettberg EJ, Hu TT, Nuzhdin SV. 2010. Population resequencing reveals local adaptation of Arabidopsis lyrata to serpentine soils. Nat Genet 42:260–263.
Turner TL, Miller PM. 2012. Investigating natural variation in Drosophila courtship song by the evolve and resequence approach. Genetics 191:633–642.
Vermeulen CJ, Sørensen P, Kirilova Gagalova K, Loeschcke V. 2013. Transcriptomic analysis of inbreeding depression in cold-sensitive Drosophila melanogaster shows upregulation of the immune response. J Evol Biol 26:1890–1902.
Wang B, Goode J, Best J, Meltzer J, Schilman PE, Chen J, Garza D, Thomas JB, Montminy M. 2008. The insulin-regulated CREB coactivator TORC promotes stress resistance in Drosophila. Cell Metab 7:434–444.
Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA. 2004. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol 5:R69.
Wang K, Li M, Hakonarson H. 2010. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 11:843–854.
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. 2011. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 98:1–8.
Weber AL, Khan GF, Magwire MM, Tabor CL, Mackay TF, Anholt RR. 2012. Genome-wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS One 7:e34745.
Weeks AR, McKechnie SW, Hoffmann AA. 2002. Dissecting adaptive clinal variation: markers, inversions and size/stress associations in Drosophila melanogaster from a central field population. Ecol Lett 5:756–763.
Zhu Y, Bergland AO, Gonzalez J, Petrov DA. 2012. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 7:e41901.