Upload
vuongdang
View
215
Download
0
Embed Size (px)
Citation preview
1
AssessingthepotentialofRAD-sequencingtoresolve1
phylogeneticrelationshipswithinspeciesradiations:the2
flygenusChiastocheta(Diptera:Anthomyiidae)asacase3
study4
5
TomaszSuchan1,2,AnahíEspíndola3*,SereinaRutschmann1,4*,BrentC.Emerson5,6,Kevin6
Gori7,ChristopheDessimoz1,8,9,10,NilsArrigo1,MichałRonikier2,NadirAlvarez1 7
8
1DepartmentofEcologyandEvolution,UniversityofLausanne,Switzerland9 2W.SzaferInstituteofBotany,PolishAcademyofSciences,Kraków,Poland10 3DepartmentofBiologicalSciences,UniversityofIdaho,Moscow,ID,USA11 4DepartmentofBiochemistry,GeneticsandImmunology,UniversityofVigo,Spain12 5IslandEcologyandEvolutionResearchGroup,InstitutodeProductosNaturalesy13
Agrobiología(IPNA-CSIC),LaLaguna,Tenerife,CanaryIslands,Spain14 6SchoolofBiologicalSciences,UniversityofEastAnglia,Norwich,UK15 7DepartmentofVeterinaryMedicine,UniversityofCambridge,UK16 8CenterforIntegrativeGenomics,UniversityofLausanne,Switzerland17 9DepartmentofGenetics,Evolution&EnvironmentandDepartmentofComputer18
Science,UniversityCollegeLondon,UK19 10SwissInstituteofBioinformatics,Lausanne,Switzerland20
21 *theseauthorsareconsideredassecondco-authors22
23
correspondingauthors:24
TomaszSuchan([email protected]),W.SzaferInstituteofBotany,PolishAcademyof25
Sciences,Lubicz46,31-512Kraków,Poland26
NadirAlvarez([email protected]),DepartmentofEcologyandEvolution,University27
ofLausanne,BiophoreBuilding,1015Lausanne,Switzerland;fax:+41216924165 28
2
Abstract29
Determiningphylogeneticrelationshipsamongrecentlydivergedspecieshaslongbeen30
achallengeinevolutionarybiology.Cytoplasmicmarkers,whichhavebeenwidelyused31
notablyinthecontextofmolecularbarcoding,havenotalwaysprovedsuccessfulin32
resolvingsuchphylogenies,butphylogeniesforcloselyrelatedspecieshavebeen33
resolvedatamuchhigherdetailinthelastcoupleofyearswiththeadventofnext-34
generation-sequencingtechnologiesandassociatedtechniquesofreducedgenome35
representation.Hereweexaminethepotentialandlimitationsofoneofsuchtechniques36
—Restriction-siteAssociatedDNA(RAD)sequencing,amethodthatproduces37
thousandsof(mostly)anonymousnuclearmarkers,indisentanglingthephylogenyof38
theflygenusChiastocheta(Diptera:Anthomyiidae).Thisgenusencompassesseven39
describedspeciesofseedpredators,whichhavebeenwidelystudiedinthecontextof40
theirecologicalandevolutionaryinteractionswiththeplantTrolliuseuropaeus41
(Ranunculaceae).Sofar,phylogeneticanalysesusingmitochondrialmarkersfailedto42
resolvemonophylyofmostofthespeciesfromthisrecentlydiversifiedgenus,43
suggestingthattheirtaxonomymayneedtoberevised.However,relyingonasingle,44
non-recombiningmoleculeandignoringpotentialincongruencesbetween45
mitochondrialandnuclearlocimayprovideincompleteaccountofalineagehistory.In46
thisstudy,weapplybothclassicalSangersequencingofthreemtDNAregionsandRAD-47
sequencing,forreconstructingthephylogenyofthegenus.Contrastingwithresults48
basedonmitochondrialmarkers,RAD-sequencinganalysesretrievedthemonophylyof49
allsevenspecies,inagreementwiththemorphologicalspeciesassignment.Wefound50
robustnuclear-basedspeciesassignmentofindividualsamples,andlowlevelsof51
estimatedcontemporarygeneflowamongthem.However,despiterecoveringspecies’52
3
monophyly,interspecificrelationshipsvarieddependingonthesetofRADloci53
considered,producingcontradictorytopologies.Moreover,coalescence-based54
phylogeneticanalysesrevealedlowsupportsformostoftheinterspecificrelationships.55
OurresultsindicatethatdespitethehigherperformanceofRAD-sequencingintermsof56
speciestreesresolutioncomparedtocytoplasmicmarkers,reconstructinginter-specific57
relationshipsmayliebeyondthepossibilitiesofferedbylargesetsofRAD-sequencing58
markersincasesofstronggenetreeincongruence.59
60
Keywords:coalescentanalysis;DNAbarcoding;maximumlikelihood;mito-nuclear61
incongruence;singlenucleotidepolymorphisms;quartetinference62
63
4
1.Introduction64
Recentlydivergedlineagesposeaproblemfortraditionalphylogeneticapproachesthat65
typicallyrelyonasmallsetofrelativelyslowlyevolvingloci(DeFilippis2000),often66
lackingresolutionatnarrowerevolutionaryscales(Cariouetal.2013).Inaddition,67
complexprocessessuchasincompletelineagesorting(Aviseetal.2008;Maddison&68
Knowles2006;Pollardetal.2006)andgeneflowamongspecies(Leachéetal.2013)69
increaseincongruencesamonggenetreesandtopologicaldeviationsfromthespecies70
tree(Dengan&Rosenberg2009;Maddison1997).Thisisespeciallytrueforlineages71
thathaveundergonerapidradiations,inwhichancestralpolymorphismssorted72
idiosyncraticallyintothedescendanttaxathroughshortevolutionarynodes(Aviseetal.73
2008),andincaseswheresubsequentevolutionaryeventsmayblurphylogeneticsignal74
(Whitfield&Kjer2008;Whitfield&Lockhart2007).Samplingmorelocihasbeen75
showntobeapromisingapproachinsuchcases(Rokas&Carroll2005;Townsendetal.76
2011;Wielstraetal.2014;Williamsetal.2013),butthespectrumofgeneticmarkers77
developedforphylogenyestimationisstilllimited(Whitfield&Kjer2008).78
Next-generationsequencingapproaches,particularlyreducedrepresentationgenome79
sequencing(Daveyetal.2011),offerthepossibilitytosamplethousandsofgenomic80
markersfromnon-modelspecies.Amongthem,Restrictionsite-AssociatedDNA(RAD;81
Bairdetal.2008)techniquesrelyonthesequencingofshortDNAfragmentsflanking82
restrictionsites,generatingrandomanonymousgenomicmarkers,homologousacross83
theanalyzedsamples(Andrewsetal.2016;Davey&Blaxter2010).Fromaphylogenetic84
perspective,animportantaspectofRADmarkersistheriseintheproportionof‘null85
alleles’asgenomedivergenceacrosssamplesincreases.Thisphenomenoniscausedby86
randommutationsoccurringintherestrictionsitesthatdecreasethenumbersof87
5
sharedRADlociamongtaxa,resultingindatamatricescontaininglargeamountsof88
missingdata(Cariouetal.2013;Chattopadhyayetal.2014;Gautieretal.2013).89
However,usinganin-silicoapproachRubinetal.(2012)andCariouetal.(2013)have90
shownthatRAD-seqdatacanbeusedsuccessfullytoresolvespeciesrelationshipsthat91
transcendtimescalesupto60Mya(millionyearsago).ExperimentallysampledRAD92
datasetshavebeenappliedtoreconstructphylogeneticrelationships,mostlyamong93
recentlydivergedtaxa(e.g.,Eaton&Ree2013;Harveyetal.2016;Jonesetal.2013;94
Leachéetal.2015;Nadeauetal.2013;Wagneretal.2013),withfewerstudiesinvolving95
moredistantlyrelatedones,evenuptoeven80Mya(e.g.,Cruaudetal.2014;Eatonet96
al.2016;HerreraandShank2016;Hippetal.2014;Panteetal.2015).Althoughthese97
genomicdatasetsimprovedphylogeneticinferencesforgroupsthatwereambiguous98
usingclassicalmarkers(e.g.,Escuderoetal.2014;Hippetal.2014),thepotentialutility99
ofRADlociforresolvingmorecomplexphylogenetichistories,suchasthosewhere100
historicalintrogressionhasoccurredorthoseassociatedwithincompletelineage101
sorting,remainsstillpoorlyexplored(Combosch&Vollmer2015;Eaton&Ree2013).102
Moreover,theuseofRADdatasetsasmarkersforevolutionarygeneticshasrecently103
beenheavilydiscussed(Lowryetal.2017;McKinneyetal.2017).104
Inthisstudy,wetesttheutilityofRAD-sequencingtorecoverphylogenetic105
relationshipsinagenusofseedparasiticpollinatorsofTrolliuseuropaeus106
(Ranunculaceae)—fliesfromthegenusChiastochetaPokorny,1889(Diptera:107
Anthomyiidae).Here,sequencingofmitochondrialmarkersfailedtorevealthe108
monophylyandphylogeneticrelationshipsamongpreviouslymorphologically109
describedspecies(Desprésetal.2002;Espíndolaetal.2012).Thisdiscordance110
betweenmorphologyandmitochondrialphylogenyhasbeeninterpretedasacallfora111
6
taxonomicrevision,andapossiblereconsiderationofconclusionsfromprevious112
ecologicalandevolutionarystudies(Espíndolaetal.2012).However,several113
mechanismsmaycausemitochondrianottotrackspeciesevolution(Funk&Omland114
2003)and,indeed,therearemanycaseswheremitochondrialandnucleargenetrees115
havebeenshowntobeincongruent(e.g.,Govindarajuluetal.2015;Phillipsetal.2013;116
Seehausenetal.2003).Asrelyingonasingle,non-recombiningmoleculemayprovidea117
misleadingaccountofaspecieshistory(Ballard&Whitlock2004),utilizingalargeset118
ofindependentnuclearloci(sampledthroughRAD-sequencing)shouldallowustotest119
themonophylyofthemorphologicallydescribedspeciesandresolvephylogenetic120
relationshipsamongthem.Whetherornotmolecularmarkersareabletoreveal121
scenariosofrapidradiationsisstillanopenquestion(Giarla&Esselstyn2015).In122
these,identifyingasinglespeciestreemightliebeyondanalyticalpossibilitiesdueto123
pervasiveconflictsamongthegenetrees,particularlywhenpopulationsizesarelarge124
andspeciationeventshappenatahigherratethanthemutation-driftequilibrium,125
eventuallyproducingconflictingtopologies.Inordertoexploregeneandspeciestrees,126
weappliedbothaconcatenation-basedphylogeneticapproach(i.e.,RAxML;Stamatakis127
2014)andacoalescence-basedinferencemethod(i.e.,SVDquartets;Chifman&Kubatko128
2014)toaRAD-seqdatasetencompassingspecimensfrom51Europeanpopulations,129
representativeofthesevenrecognizedChiastochetamorphospecies.Inorderto130
examinetheextenttowhichdifferentcombinationsofRADlocimayproducedistinct131
speciestrees,weusedanewlydevelopedalgorithmthatperformslocibinning,using132
dissimilaritylevelsamongphylogeneticpatternsretrievedatsingleloci(treeCl;Goriet133
al.2016).Wealsoappliedpopulationgeneticsclusteringalgorithms(i.e.,STRUCTURE;134
Pritchardetal.2000)asacontrol.Eventually,wecomparedourresultstothose135
7
obtainedwithclassicalphylogeneticinferencebasedonconcatenationofthree136
mitochondrialregions.137
2.Materialsandmethods138
2.1Studysystem139
ThecenteroforiginanddiversityofChiastochetahasbeeninferredtobetheWestern140
Palearctic,wheresevenflyspeciesareinvolvedinnurserypollinationinteractionswith141
TrolliuseuropaeusL.(Espíndolaetal.2012;Pellmyr1989,1992;Suchanetal.2015).142
ThesesevenmorphologicallydelimitedEuropeanChiastochetaspecies,namelyC.143
dentiferaHennig1953;C.inermella(Zetterstedt,1838);C.lophotaKarl,1943;C.144
macropygaHennig,1953;C.rotundiventrisHennig,1953;C.setiferaHennig,1953andC.145
trollii(Zetterstedt,1838)areecologicallyverysimilarandoftensympatric(Collin1954;146
Hennig1976;Michelsen1985;Zetterstedt1845;V.Michelsenpers.comm.).Inhis147
monographofthisplant-pollinatorinteraction,Pellmyr(1992)discussedanother148
species,C.abruptiventrisasanorthernvicariantofC.rotundiventris,ataxonnot149
supportedbypreviousmolecularstudies(Espíndolaetal.2012)andneverformally150
described.AlthoughallChiastochetareproducewithintheflowersofT.europaeus,with151
potentialcross-speciesmatingpossibilities,noputativehybridshavebeenobserved152
basedongenitalmorphology(T.SuchanandA.Espíndola,pers.obs.).153
Althoughthespeciesarewelldefinedmorphologically,mitochondrialphylogenies154
recoveredonlythreemonophyleticclades–C.rotundiventris,C.dentifera,andC.lophota155
(Desprésetal.2002;Espíndolaetal.2012),andsuggestedapolyphyleticoriginforC.156
inermellaandC.setifera(Desprésetal.2002;Espíndolaetal.2012),withC.macropyga157
andC.trolliibeingparaphyletic(Espíndolaetal.2012).Moleculardatingplacedthe158
8
mostrecentcommonancestorofallEuropeanspeciesattheendofthePliocene(2-3.4159
Ma;Desprésetal.2002;Espíndolaetal.2012),andindicatedthatmostdiversification160
eventsoccurredwithinthelast1.6Ma.161
2.2Sampling162
Chiastochetaspecimensweresampledfrom51Europeanpopulationsduringspringand163
summer2006,2007,and2008(Table1;mapsonFig.S1).Theflieswerekilledand164
preservedin70%ethanolandstoredatroomtemperatureuntilDNAextraction.165
CollectedspecimenswereidentifiedtomorphospeciesfollowingHennig(1976)and166
unpublishedkeys(V.Michelsen).Allidentificationswereconfirmedbyanexpert(V.167
Michelsen,NaturalHistoryMuseumofDenmark,Copenhagen),asthetaxonomical168
revisionofthegenusisnotyetpublished.169
2.3SequencingmitochondrialregionsandRADmarkers170
DNAwasextractedfrominsectlegsusingaDNeasyBloodandTissueKit(Qiagen,171
Hilden,Germany),followingthemanufacturer’sinstructions.Weamplifiedthree172
mitochondrialregions:COI,COII,andtheultra-variableD-loop(control)region.We173
followedEspíndolaetal.(2012)forsequencingoftheCOIandCOIIregions.ForD-loop174
weusedprimersTM-N-193andSR-J-14612asdesxcribedinSimonetal.(1994)as175
describedbyEspíndolaetal.(2012)withthefollowingmodificationofthePCR176
program:5minat95°C,followedby35cyclesof1minat95°C,1minofannealing177
at55°Cand2minofelongationat60°C,and5minoffinalelongationat60°C.PCR178
productsweresequencedatMacrogenInc.(SouthKorea)andFasterisSA(Switzerland).179
ChromatogramswerevisuallycorrectedonChromasPro1.41(TechnelysiumPty.Ltd.).180
AlignmentwasperformedusingMUSCLEalgorithm(Edgar2004)inGeneious10.1.3181
9
(Biomatters,Auckland,NewZeland)andgapswithmorethan50%missingdatainthe182
D-loopregionwereremoved.Additionally,adatasetwithD-loopremovedwas183
analyzed.DoubledigestRAD(ddRAD)librarieswerepreparedaccordingtoMastrettaet184
al.(2014),amodifiedprotocolofPetersonetal.(2012),withoutperformingthesize-185
selectionofDNAfragments,andotherminormodifications(seeSupporting186
Information).TheenzymesusedforDNAdigestionwereSbfIandMseI.Librarieswere187
sequencedattheLausanneGenomicTechnologiesFacility(Switzerland)onthreelanes188
oftheHiSeq2500instrument(Illumina,SanDiego,USA)usinga2x100bppaired-end189
readsprotocol.ForRAD-sequencingweintroducedtechnicalreplicatesforoptimizing190
denovoassemblyandcontrolsfortheeffectsofsequencingerrorsandalleledropouton191
thefinalresults(Mastretta-Yanesetal.2015;sampleswith“REPL”suffixinTableS1),192
andDNAextractionreplicatesfromtheflythoracicmuscle(inordertocontrolforflies’193
bodycontaminationwithpollen;sampleswith“MUS”suffixinTableS1).194
2.4RAD-seqlociassembly195
TwoimportantconsiderationsfordenovoRADlociassemblyaretheparametersfor196
clusteringorthologousloci,whilefilteringoutparalogs(Eaton2014;Mastrettaetal.197
2015).Ifthesequencesimilarityrequiredtoconsidersequenceasorthologsissettoo198
high,realheterozygousalleleswillbesplitintomorethanonecluster,therefore199
creatingfalsehomozygousloci(Harveyetal.2015).Ontheotherhand,ifthesimilarity200
issettoolow,thiswillresultinparalogoussequencesbeingclusteredtogether.Several201
methodswereproposedforfilteringoutsuchsequencesfromthefinaldataset,202
includingploidyfiltering(removingclustersthathavemorethantwosequencesper203
individual)andfilteringouthighlyvariableloci(Eaton2014;Ilutetal.2014).Asthere204
10
arenogeneralguidelinesforfine-tuningtheparametersmentionedabove,we205
empiricallytestedhowthedifferentclusteringparametersaffectedthefinaldataset.206
Finally,wechosethedatasetwithclusteringparameterthatmaximizedthelocioverlap207
betweenpairsoftechnicalreplicates(seebelow).Locioverlapamongsamplesandpairs208
oftechnicalreplicateswerecalculatedusingtheRADami1.0libraryinR(Hippetal.209
2014).210
ReaddemultiplexinganddenovoassemblyofRADlociwasperformedusingthepyRAD211
3.0.1package(Eaton2014),basedonanalignment-clusteringalgorithm.Thisapproach212
allowsforindelvariationamongmoredivergedspecimens.Moreover,italsoallowsfor213
lowersimilarityamongtheclusteredreads,makingitwell-suitedforphylogenetic-scale214
analyses.First,thereadsweredemultiplexedaccordingtothein-line6-nucleotides215
barcodepresentatthebeginningofeachsequencedfragment,whileallowingforone216
mismatch.Onlyreadswiththerestrictionsitepresentwereretainedforfurther217
analyses.AllnucleotideswithPhredqualityscorelowerthan20wereconvertedto218
unknownbasesandreadswithmorethanfourunknownsiteswereremovedfromthe219
dataset.Readswerethenclusteredwithinandbetweenindividuals,withaminimum220
numberofsixreadstoformaclusterandsequencesimilarityof75%,80%,85%,90%,221
and95%.Possibleclusteredparalogsorrepetitivesequenceswereremovedbyfiltering222
outthelocithathadmorethanfivevariablepositionsperlocusormorethan10shared223
polymorphicsitesinalocusamongindividuals,andthelociforwhichmorethantwo224
alleleswerepresentperindividual.Finally,datasetswereproducedbyretainingtheloci225
presentinaminimumof10,20,and100individualsandcomparedforthetotalnumber226
ofloci,proportionofmissingdata,locioverlapamongreplicates,andthemeannumber227
ofindividualsperlocus.228
11
2.5Phylogeneticanalyses229
WeperformedMaximum-likelihood(ML)analysesusingRAxML(Stamatakis2014)230
withrapidbootstrapanalysesandextendedmajority-ruleconsensustreeautomatic231
bootstrapstoppingcriterion,followingsearchforthebest-scoringMLtree.The232
mitochondrialregionswerepartitionedusingthePartitionFinder2.1.1software233
(Lanefaretal.2016).AnalyseswereperformedintheRAxML8.2.4,forthe234
mitochondrialdataset.ForthedatasetconsistingofCOI,COII,andD-loopregionsthe235
GTR+G+Imodelwithallthreenucleotidepositionsoncodinggenesconsideredas236
separatepartitionsandD-loopasafourthpartition.ForthedatasetconsistingofCOI237
andCOIItheGTR+Gmodelwiththefirsttwonucleotidepositionsconsideredasafirst238
andthirdnucleotidepositionconsideredasasecondpartition.AnalysesfortheRAD239
datasetwereperformedusingtheGTRCATmodelintheRAxML8.2.10versiononthe240
CIPREScluster(SanDiegoCA,USA;Milleretal.2010).FortheRAD-baseddataset,241
replicatedsampleswereretainedinthephylogeneticanalysesandtheconcatenated242
matrixwasconsideredasasinglepartition.Additionally,MLanalysesoftheRAD243
datasetswithotherclusteringparameterswereperformedinordertoevaluatehowthis244
parameteraffectsthetreetopology.ThetreeswererootedwithC.rotundiventrisasan245
outgroup,asidentifiedpreviously(Desprésetal.2002;Espíndolaetal.2012).246
Toaccountfortheeffectsofincongruenceamongnuclearlociontheinferred247
phylogenies—forinstanceresultingfromincompletelineagesorting,weappliedthe248
methodofGorietal.(2016)implementedinthetreeClpackage(http://git.io/treeCl).249
BecausethemajorityofRADlocihadsparsecoverageovertheindividuals,wekeptonly250
locipresentinmorethan100individualsforthispartofanalysis.TheML251
phylogenenieswerefirstcalculatedforeverylocususingtheGTR+Gmodelas252
12
implementedinRAxML8.1.11(Stamatakis2014).Then,pairwisegeodesicdistances253
betweenalltheresultingsingle-locusphylogeniesweremeasured,andthetreeswere254
groupedbasedonthedistancematrixusingspectralclustering(aprotocolhereafter255
referredtoasbinning).Thenumberofbinswasestimatedusingthenonparametric256
bootstrappingstoppingcriterion.Supportforeachbranchineachtopologywas257
calculatedusingaBayesinPhyML(Anisimovaetal.2011).Wealsoanalyzedthelog-258
likelihoodimprovementwhenanalyzingthedatawithn+1splitsvs.nsplits,compared259
tothenullexpectation(i.e.randomlociclustering).260
Additionally,weappliedacoalescent-basedinferencemethodusingSVDquartets261
(ChifmanandKubatko2014)asimplementedinPAUP*v.4a150andv.4a151(Swofford262
2002).Thismethodinfersthetopologyamongrandomlysampledquartetsusinga263
coalescentmodel,andassemblestherandomlysampledquartetsusingaquartet264
amalgamationmethod.Breakingthesequenceintoquartetsmakestheanalysisoflarge265
numbersoflocifeasible.Werandomlysampledthemaximumofallpossiblequartets266
(i.e.48,603,900quartets=200taxa)withthemultispeciescoalescentoptionand1,000267
bootstrapreplicates.ThequartetsweresummarizedwiththeQFM(Reazetal.2014)268
quartetamalgamationprogramasimplementedinPAUP*.Phylogenetictreeswere269
visualizedusingtheape3.2Rpackage(Paradisetal.2004).270
2.6Populationstructure271
Weinferredpopulationstructureusingtheadmixturemodelimplementedin272
STRUCTURE2.3.4(Pritchardetal.2000),withoutpriorpopulationassignmentandwith273
allelefrequenciescorrelatedamongpopulations.ThesoftwareusesaBayesian274
frameworktoestimatethelikelihoodofthedatagivenanumberofaprioridefinedK275
13
populationclusters,outputtingthelikelihoodofeachsampletobelongtoeachpossible276
cluster.Thisanalysiswasperformedafterremovingtechnicalreplicatesfromthe277
dataset,retainingonlythelocipresentinaminimumof20individuals,andselecting278
onerandomsingleSNPfromeachlocus.AnalyseswererunforKvaluesranging279
between1and8,withaburn-inof200Kcycles,followedby1Mcyclesofsampling,with280
3replicatesforeachKvalue.TheoptimalKvaluewasidentifiedfollowingEvannoetal.281
(2005),asimplementedinSTRUCTUREHARVESTER(Earl&vonHoldt2012).To282
accountforthephylogeneticcomponentinthemissingalleles,weranSTRUCTUREwith283
therecessiveallelesmodel,withmissingdatacodedasrecessive.284
3.Results285
3.1Chiastochetasampling286
Weanalyzedatotalof272ChiastochetaspecimenssampledfromtheentireEuropean287
rangeofthegenus(seeTable1andFig.S1formapsofthesampledspecimens).Most288
speciesdisplayedabroadspatialdistribution.Uptosixspeciescouldbefoundinone289
singlelocalityduringasinglevisit(Table1,mean=2.7speciesperlocality,SD=1.5),290
confirmingthesympatricnatureofthespeciesandtheexistingopportunitiesfor291
hybridization.292
3.2SequencingandRADlociassemblyresults293
Afterinitialscreening,21sampleswereremovedfromthefinaldatasetbecauseof294
insufficientcoverageortechnicalerrors.Wesuccessfullyanalyzed260specimensfor295
themitochondrialdataset(255forCOI,204forCOII,141forthefirst,and152forthe296
secondfragmentoftheD-loop),and263fortheRADdataset,while251sampleswere297
14
sharedbetweenthedatasets(Table1).FortheRADdataset,wealsosequenced22298
technicalreplicates(sampleswith“REPL”suffixinTableS1),and11DNAextraction299
replicatesfromtheflymusclevs.extractionsfromlegs(sampleswith“MUS”suffixin300
TableS1).301
Sequencingofthemitochondrialregionsyielded1132nucleotidepositionsforthe302
COI+COIIdataset(ofwhich120werevariable)and2003fortheCOI+COII+D-loop303
dataset(ofwhich334werevariable),afteralignmentandgapfiltering.Threerunsof304
RADsequencingoutput552’425’482of2x100bpreads,fromwhich340’598’636305
(62%)passedtherestrictionsiteandbarcodequalityfilters(TableS1).306
Aftercomparingthenumberofloci,coverage,andoverlapoflociamongreplicatesin307
theobtaineddatasets(Fig.S2),wechosethedatasetwithaminimumof75%sequence308
similarityrequiredforthesequencestoclusterinalocusandaminimumof20309
individualsperlocusforthemainanalyses.Thisdatasetcontained1724lociafter310
filteringandparalogremoval,with82’782variablesites.Theproportionofmissingdata311
inthedatasetwas0.84,withastrongphylogeneticcomponentinthedistributionof312
missingloci(Fig.S3a).AftersamplingoneSNPperlocusfortheSTRUCTUREanalysis,313
weobtained1669SNPs,ofwhich159werebi-allelic.314
ForthedatasetusedforassessinglociincongruenceintheRAD-seqbasedphylogeny315
(seebelow),wefocusedonlocipresentinatleast100individuals.Thisresultedina316
matrixof176loci(among1724overallnumberoflociidentified;i.e.,10.2%)with317
missingdatashowingmuchlessphylogeneticstructuring(Fig.S3b).318
15
3.3Mitochondrialandnuclear-dataphylogenies319
ThemitochondrialphylogenyontheCOI+COII_D-loopdataset(Fig.1aandFig.S4a)320
failedtoresolvefourofthecladesidentifiedbasedontheRAD-seqdata(seebelow),but321
retrievedwell-supportedmonophyleticgroupforC.rotundiventrisand,tothelesser322
extentforC.dentiferaandC.inermella,asbothofthelatterhadtwospecimensplaced323
outsidetheirclades.C.inermella,C.setifera,andC.trolliiformedonecladewiththe324
speciesextensivelyinterdispersedandacladecontainingmostlyC.macropyganested325
within.AstheanalysisbasedbasedonthereducedCOI+COIIdatasetrecoveredasimilar326
pattern,exceptplacingC.lophotaassistertoC.macropyga,werefertotheresultsofthe327
largerdatasetintherestofthepaper.MostoftheC.lophotasamplesalsoformedone328
cladewithlowersupportvalues.Therelationshipsamongsamplesfromtheremaining329
fourmorphospeciesremainedunresolved,withoutclearsupportforthe330
morphologicallydescribedspecies.331
IncontrasttotheMLmtDNAphylogeny,bothMLandSVDquartetsanalysisoftheRAD332
analysis(Fig.1b,candFig.S4b,c)confirmedmonophylyofthesevenmorphologically333
definedtaxa.RAxMLanalysisrevealedrelativelyhighbootstrapsupports(>90%)for334
alloftheinterspecificrelationships,exceptthesplitbetweenC.setiferaandtheclade(C.335
lophota,(C.macropyga,(C.dentifera,C.trollii)))withbootstrapsupport>80%.Thesplit336
ofC.rotundiventrisintotwoputativevicariantclades,informallyproposedbyPellmyr337
(1992)—northernC.abruptriventrisandsouthernC.rotundiventris,wasnotrecovered.338
SVDquartetsanalysisalsoconfirmedmonophylyofthespecies,butonlythesplit339
betweenC.dentiferaandC.trolliihadabootstrapsupport>90%;theclade(C.340
macropyga,(C.dentifera,C.trollii))hadbootstrapsupport>80%;thesetwocladeswere341
theonlyonessupportedbybothSVDquartetsandtheRAxMLanalyses(Fig.1c).342
16
Moreover,SVDquartetsrevealedtwowell-supportedcladeswithinC.rotundiventris.343
Thesehoweverdonotshowanypatternofvicarianceandoftenoccurtogetherina344
singlepopulation,thusmostlikelydonotcorrespondtothetwovicariantspeciesofC.345
abruptiventrisandC.rotundiventrisasdiscussedbyPellmyr(1992).Thetechnical346
replicateswereconsistentintheplacementofthesamplewithintheproperclade,and347
mostreplicateswereplacedassistercladeswithbothmethods(Fig.S4b,c).348
3.4IncongruenceamongtheRAD-sequencingloci349
TreeClanalysisidentified,inthemostconservativeinterpretation,atleastfourclusters350
ofloci,asthelargestlikelihoodimprovementwasobtainedwhenincreasingthenumber351
ofbinsfromthreetofour(Fig.S6).Thebinsizeswereof29,42,47,and58loci,352
thereforetheidentifiedgroupswerenotsimplyconsistingofafewoutliers.Thetrees353
inferredforthefourbinsconfirmedthemonophylyoftheanalysedspeciestoalarge354
extent,althoughfewindividualsappearedoutsidetheirexpectedclades.Thelargest355
departurefrommonophylywasobservedforC.lophotainthesmallesttreeconsistingof356
29loci(Fig.2).Thetreesinferredforeachclusterhadbranchsupportsforinterspecific357
nodeslargerthan95%,anddifferedsubstantiallyintermsoftopologyandbranch358
lengths.Onlyonetree,withthelargestnumberofloci(i.e.,58)supportedtheonlyclade359
thatwassupportedbybothRAMLandSVDquartetsanalysis(C.macropyga,(C.360
dentifera,C.trollii)).Exceptthat,theinterspecificrelationshipsretrievedwitheachof361
thetreeClbinsweredifferentthanwiththeconcatenatedRAxMLanalysisand362
SVDquartetsanalysis(Fig.1b,c).363
17
3.5Structureanalysis364
Wefoundlowlevelsofcontemporaryintrogression,asshownbySTRUCTUREanalysis.365
ThemostlikelyKnumberofSTRUCTUREgroupswasconsistentwiththenumberof366
morphologicalspecies(7),andallsampleswereassignedtotheir‘correct’367
morphospecies(Fig.1d).Alsoforlowernumbersofclusters,wedidnotobserve368
signaturesofintrogression(Fig.S5). 369
4.Discussion370
4.1UtilityandlimitsofRAD-sequencingforresolvingphylogenyof371
a„difficult”genus372
RAD-sequencingsuccessfullydiscriminatedallformallydescribedEuropean373
Chiastochetaspecies.Therobustspeciesdelineationisstrikingwhencomparedto374
mtDNA-basedtreesthatfailedtosupportmonophylyofC.inermella,C.macropyga,C.375
setifera,andC.trollii(Fig.1aandFig.S4a;seealso:Desprésetal.2002;Espíndolaetal.376
2012).Theabilitytorecoverpreviouslydefinedmorphologicalspeciesinourdataset,377
whateveranalysismethodused(i.e.,maximum-likelihoodphylogeneticreconstruction378
usingaconcatenatedmatrixwithRAxML,coalescence-basedphylogeneticinference379
withSVDquartets,orpopulation-geneticsclusteringwithSTRUCTURE),supportsthe380
resultsofaprevioussimulationstudybyHovmölleretal.(2013),thathighamountsof381
missingdata,typicalforRAD-baseddatasets,shouldnotinterferewithclade(orcluster)382
identification.Recently,similarconclusionsweredrawnbyEatonetal.(2016)383
concerningtheSVDquartetsmethod.384
18
Incontrast,noconsensuscouldbereachedinretrievinginter-specificrelationships.385
WhereasRAxMLidentifiedrelationshipswithhighbootstrapsupportinfourofthefive386
possibleinterspecificrelationships,onlytwoofthemwerealsosupportedbythe387
SVDquartetsanalysis(Fig.1b,c).Incongruenceinthephylogeneticsignalsassociated388
withdifferentsetsoflocicouldexplainthedifficultyinresolvingtheseinterspecific389
relationships.WhenperforminglocibinningusingtreeCL(Gorietal.2016),wefound390
outthatdifferentsubsetsofloci(inourcase,theoptimalnumberofbinswasequalto391
four)produceddifferenttopologies,whilestillbeinglargelycongruentinthesample392
assignmentintospecies(Fig.2).393
Shortinterspecificbranchesintheresolvedphylogeniesconfirmtheconclusionsof394
Espíndolaetal.(2012)thatmostofthespeciesfromtheChiastochetagenusunderwent395
arecent(lessthan1.6Mya),rapidradiation.Theseresultshighlightthefactthatinsuch396
casesitmaybeimpossibletoretrievesomeofthephylogeneticrelationshipsamongthe397
taxaasfullybifurcatingtree,becausegenetreesmaydepictdifferentevolutionary398
historiesduetoincompletelineagesorting(Aviseetal.2008;Maddison1997).Thisisa399
limitationsharedwithclassicalmarkers(Walshetal.1999)andotherNGSapproaches400
(seebelow),pointingtoapossibleconstitutivelimitationinresolvingrapidradiations.401
Inrapidlydivergingtaxa,eventhelargenumberofnuclearmarkers,whilebeingmore402
successfulhereinrecoveringspeciesboundariesthanmitochondrialmarkersmaynot403
beinformative-enoughtoretrieveallinterspecificevolutionaryrelationships.404
TheextenttowhichtheabovelimitationistheresultoftechnicalconstraintsofRAD405
datasetsoratruebiologicallimitationremainstobeinvestigated.RAD-seqtargets406
random,mostlyneutralpartsofthegenome.Thisresultsinhighnumberoflineage-407
specificmutationsthatbearastrongsignaltodelineatespeciesorpopulations–within408
19
thesefast-evolvingpartsofthegenome,evenvaryingallelecopy-numbers(i.e.recent409
paralogs)canappearaspopulation-specific(Mastretta-Yanesetal.2014).The410
downsideishowever,missingdataincreasesrapidlywithevolutionarydistanceasa411
resultofthelossofrestrictionsites(Cariouetal.2013; Chattopadhyayetal.2014;412
DaCosta&Sorenson2016;Gautieretal.2013;Rubinetal.2012;Wagneretal.2013).413
Forinstance,Leachéetal.(2015)founddifferencesbetweenphylogeniesobtained414
usingRAD-seqvs.targetenrichmenttechniques,whereasotherstudieshaveshownthe415
agreementamongdatatypes(Mantheyetal.2016).Thelattertechniquesrelyon416
captureofapredefined(Fairclothetal.2014;McCormacketal.2012)orrandom417
(Suchanetal.2016,Schmidetal.2017)subsetofloci.Bynotrelyingonthepresenceof418
restrictionsites,andthushavinglessmissingdata,enrichmenttechniquesmaybe419
bettersuitedforbroaderphylogeneticscales.420
Nevertheless,ithasbeenshownthatevenwithhundredsofconservedloci,known421
substitutionmodelsandseveralindividualsperspecies,treeswithshortbranchesare422
difficulttoresolve,andMLanalysesbasedonconcatenatedsequencesmayprovidehigh423
bootstrapvaluesdespiteincorrectlyresolvedtopologies(Giarla&Esselstyn2015;424
Kubatko&Degnan2007;butseeGatesy&Springer2013;Springer&Gatesy2016;Roch425
&Warnow2015).Thisisexemplifiedbyourstudy,inwhichusingallRADloci,we426
obtainedaMLphylogenywithhighlysupportedinterspecificnodes,whereas427
coalescence-basedphylogeneticinferencedidnotshowstrongsupportsformostofthe428
interspecificrelationships.Ourexplorationofexplanationsforsuchadiscrepancyusing429
thelocibinningapproachshowedsupportforatleastfourdifferentunderlyinggene430
treetopologies.Intheseanalyses,wereducedthedatasettoanon-randomsetofloci431
whenfilteringforhighlocicoverageamongsamples.Theretainedloci,presentinat432
20
least100analyzedindividuals,andwithlessphylogenetically-structuredmissingdata433
(seeFig.S3b),shouldbecharacterizedbylowermutationratesorbeingunder434
stabilizingselection(Huang&Knowles2014).Usingbinning,thebestfittothedatawas435
notobtainedwithasinglebinoflocibutwithfour.Wecouldthereforenotidentifyone436
singleevolutionaryhistoryoftheChiastochetagenus,butratherequally-supported437
genetreestopologies.Importantly,thesedifferenttopologiescannotbeattributedtoa438
fewoutlierloci,astheirdistributionwasrelativelyevenacrosstheclusters(29,58,42439
and47loci;Fig.S6),incongruenceamongthesesetspossiblyimpactingmaximum-440
likelihoodphylogeneticreconstructionusingaconcatenatedmatrixandcoalescence-441
basedphylogeneticinference.Wehavealsoconfirmedthatinsuchcases,MLmethods442
provideelevatedbootstrapsupportvalues,andthatlowerbootstrapsupportvalues443
resultingfromcoalescence-basedmethodsmaybetterreflectthebiologicaluncertainty444
ofinterspecificrelationships.445
4.2MitonucleardiscordanceinthephylogenyofChiastocheta446
WhileourRAD-sequencedatasetdelineatedsevenclades,withfullagreementwiththe447
morphologicalassignments,mitochondrialdatafailedtosupportspeciesmonophyly,448
exceptforC.rotundiventrisand,toalesserextent,C.lophotaandC.dentifera.Theother449
remainingspecies:C.inermella,C.setiferaandC.trolliiformedalargecladewiththe450
speciesextensivelyinterdispersedandwiththecladeconsistingmostlyofC.macropyga451
nestedwithin(Fig.1a).Despite,onaverage,mitochondrialmarkersshouldbemore452
suitedforcapturingrelationshipsamongrecentlydivergedlineages,duetoaneffective453
populationsizefourtimeslessthanthatofnucleargenes(assumingneutralprocesses,454
equalsexratios,andunbiasedmatingsystems),andthusshortercoalescencetimes455
21
(Zink&Barrowclough2008),analyzingalargedatasetofnuclearmarkersprovided456
morepowertodiscriminatethespeciesinourcase.457
Mitonucleardiscordancepatternscanbeexplainedeitherbythedifferentbiological458
propertiesofmitochondrialDNA(vegetativesegregation,uniparentalinheritance,459
intracellularselection,andreducedrecombination;Birky2001)ordifferencesinthe460
evolutionaryhistoriesofnuclearandmitochondrialmarkers[e.g.,directselectionon461
themitochondrialgenes(Ballardetal.2007;Ballard&Pichaud2014;Boratyńskietal.462
2014;Dowlingetal.2008),incompletelineagesorting,historicalorongoinggeneflow463
amongspecies,orhybridspeciation].Indeed,ithasbeenshownbeforethatrelyingona464
single,non-recombiningmtDNAmoleculemayprovideamisleadingaccountofa465
specieshistory(e.g.,Ballard&Whitlock2004;Govindarajuluetal.2015;Phillipsetal.466
2013;Seehausenetal.2003;andreviewsbyFunk&Omland2003;Rubinoff&Holland467
2005).Whileinvestigatingthereasonsforthemito-nucleardiscordancewasnotwithin468
thescopeofthispaper,wecouldrejectthehypothesisofacontemporarygeneflowor469
hybridoriginofthetaxaasresponsibleforthispattern.Wedidnotdetectsignatureofa470
geneticmosaicinthe—mostly—nuclearRADdata,whichwouldbeexpectedinthecase471
ofhybridorigin(Ballard2000;Brelsfordetal.2011;Mallet2007;Pollardetal.2006).472
UsingRAD-sequencingdata,theassignmentofsamplesintospecieswasconcordant473
withmorphology(Fig.1b,candS4b,c)andwedidnotdetectsignificantlevelsof474
contemporarygeneflowusingpopulationgenetics-basedapproaches(Fig.1d),despite475
apparentopportunitiesforhybridization.MostofChiastochetaoccurinsympatry(Fig.476
S1),theyalsohaveverysimilarbiologies,reproducingandspendingmostoftheirtime477
onorinsideflowersofTrolliuseuropaeus(Pellmyr1989;Suchanetal.2015).Althougha478
temporalsequenceinovipositionhasbeenobserved(Després&Jaeger1999;479
22
Johannesen&Loeschcke1996;Pellmyr1989),mostspeciesco-occurtemporally.480
Despitetheseecologicalsimilaritiesandtherelativelyyoungageofthegenus(mostof481
thecladesemerginglessthan1.6Ma;Espindolaetal.2012),alackofnuclearevidence482
forhybridizationindicatesstrongcontemporaryreproductivebarriersamongthe483
species.484
5.Conclusions485
ThisstudydemonstrateshowacombinationofRAD-seqandmtDNAdatacanprovide486
insightsintophylogeniesofgenerathatarepoorlyresolvedusingmitochondrial487
markersaloneandrevealcomplexpictureofmitonucleardiscordance.Italso488
underlinesthelimitsofRAD-seq-basedphylogeniesincaseofrapidradiations.Our489
resultsshowthatascenarioofrapidradiationcanaffectmanylociacrossthegenome,490
leadingtodiscordantgenetrees,evenwhenusingmethodscontrollingforincomplete491
lineagesorting.Thismaypointtoaninherentlimitationofusingmolecularmarkersto492
resolverapidradiations,atleastatsomeoftheinter-specificrelationships,andsuggests493
thatthislimitationisnotnecessarilyduetotechnicalissues(e.g.lownumberofshared494
markers).495
Addingtothebodyofexamplesofmito-nucleardiscordance(reviewedinToews&496
Brelsford2012),ourstudywarnsagainstrelyingsolelyonmitochondrialmarkers(e.g.,497
COIbarcoding;Herbertetal.2003)forspeciesdelimitation,especiallywhentheyshow498
incongruencewithclassicaltaxonomy.Inthecasepresentedhere,mitochondrial499
markerssuggestedpoly-orparaphylyformostspecies,andproposedtheneedto500
reviewthetaxonomyofthegenus(Espíndolaetal.2012).Whentackledfromthe501
genomicpointofview,thegeneticsupportofspeciesstatusforthesesevenentitieswas502
23
confirmed.Finally,weprovideanexampleofhowMLphylogeniesbasedonlarge503
concatenateddatasetscanprovideerroneouslyhighbootstrapsupportsforincorrector504
uncertaintopologies(Giarla&Esselstyn2015;Kubatko&Degnan2007).505
Acknowledgments506
TheauthorswouldliketothankY.Triponez,N.Magrou,P.Lazarevic,D.Gyurova,R.507
Lavigne,L.Juillerat,N.Villard,andR.ArnouxfortheirhelpduringthefieldworkandV.508
Michelsenforhisgreathelpwithspeciesdetermination.WealsothankJ.OllertonandJ.509
Pannellforinsightfuldiscussionsandcommentsonthemanuscript.Thisworkwas510
supportedbythegrant‘FondsdesDonations’oftheUniversityofNeuchâtel511
(Switzerland)attributedtoAE.TSacknowledgesfundingfromtheRectors’Conference512
oftheSwissUniversities(CRUS)throughthe“ScientificExchangeProgrambetween513
SwitzerlandandthenewEUmemberstates”(SciexNMSgrantno.10.116)andfrom514
SociétéAcadémiqueVaudoise(Switzerland).ThisworkwasfundedbytheSwiss515
NationalScienceFoundation(grants3100A0-116778,PZ00P3-126624,516
PP00P3_144870toN.Alvarez,andgrantPP00P3_150654toCD)andbyagrantfrom517
SwitzerlandthroughtheSwissContributiontotheenlargedEuropeanUnion(Polish-518
SwissResearchProgram,projectno.PSPB-161/2010). 519
24
References520
AndrewsKR,GoodJM,MillerMR,LuikartG,HohenlohePA(2016)Harnessingthe521
powerofRADseqforecologicalandevolutionarygenomics.NatureReviews522
Genetics,17,81-92.523
AnisimovaM,GilM,DufayardJ-F,DessimozC,GascuelO(2011)SurveyofBranch524
SupportMethodsDemonstratesAccuracy,Power,andRobustnessofFast525
Likelihood-basedApproximationSchemes.SystematicBiology,60,685–699.526
AviseJC,RobinsonTJ,KubatkoL(2008)Hemiplasy:anewterminthelexiconof527
phylogenetics.SystematicBiology,57(3),503-507.528
BairdNA,EtterPD,AtwoodTS,CurreyMC,ShiverAL,LewisZA,SelkerEU,CreskoWA,529
JohnsonEA(2008)RapidSNPdiscoveryandgeneticmappingusingsequencedRAD530
markers.PLoSONE,3,e3376.531
BallardJWO(2000)Whenoneisnotenough:introgressionofmitochondrialDNAin532
Drosophila.MolecularBiologyandEvolution,17,1126-1130.533
BallardJWO,MelvinRG,KatewaSD,MaasK(2007)MitochondrialDNAvariationis534
associatedwithmeasurabledifferencesinlife-historytraitsandmitochondrial535
metabolisminDrosophilasimulans.Evolution,61,1735-1747.536
BallardJWO,PichaudN(2014)MitochondrialDNA:morethananevolutionary537
bystander.FunctionalEcology,28,218-231.538
BallardJWO,WhitlockMC(2004)Theincompletenaturalhistoryofmitochondria.539
MolecularEcology,13,729-744.540
25
BirkyCW(2001)Theinheritanceofgenesinmitochondriaandchloroplasts:laws,541
mechanisms,andmodels.AnnualReviewofGenetics,35,125-48.542
BoratyńskiZ,Melo-FerreiraJ,AlvesPC,BertoS,KoskelaE,PentikäinenOT,MappesT543
(2014)Molecularandecologicalsignsofmitochondrialadaptation:consequences544
forintrogression.Heredity,113,277-286.545
BrelsfordA,MiláB,IrwinDE(2011)HybridoriginofAudubon’swarbler.Molecular546
Ecology,20,2380-2389.547
CariouM,DuretL,CharlatS(2013)IsRAD-seqsuitableforphylogeneticinference?An548
insilicoassessmentandoptimization.EcologyandEvolution,3,846-852.549
ChattopadhyayB,GargKM,RamakrishnanU(2014)Effectofdiversityandmissingdata550
ongeneticassignmentwithRAD-Seqmarkers.BMCresearchnotes,7,841.551
ChifmanJ,KubatkoL(2014)QuartetinferencefromSNPdataunderthecoalescent552
model.Bioinformatics,30:3317-3324.553
CollinJE(1954)ThegenusChiastochetaPokorny(Diptera:Anthomyiidae).Proceedings554
oftheRoyalEntomologicalSocietyLondon(B),23,95-102.555
ComboschDJ,VollmerSV(2015)Trans-PacificRAD-Seqpopulationgenomicsconfirms556
introgressivehybridizationinEasternPacificPocilloporacorals.Molecular557
PhylogeneticsandEvolution,88,154-162.558
CruaudA,GautierM,GalanM,FoucaudJ,SaunéL,GensonG,RasplusJY(2014)559
EmpiricalassessmentofRADsequencingforinterspecificphylogeny.Molecular560
BiologyandEvolution,31,1272-1274.561
26
DaCostaJM,SorensonMD(2016)ddRAD-seqphylogeneticsbasedonnucleotide,indel,562
andpresence–absencepolymorphisms:Analysesoftwoaviangenerawith563
contrastinghistories.MolecularPhylogeneticsandEvolution,94,122-135.564
DaveyJL,BlaxterMW(2010)RADSeq:Next-generationpopulationgenetics.Briefingsin565
FunctionalGenomics,9,416-423.566
DaveyJW,HohenlohePA,EtterPD,BooneJQ,CatchenJM,BlaxterML(2011)Genome-567
widegeneticmarkerdiscoveryandgenotypingusingnext-generationsequencing.568
NatureReviewsGenetics,12,499-510.569
DeFilippisVR,MooreWS(2000)Resolutionofphylogeneticrelationshipsamong570
recentlyevolvedspeciesasafunctionofamountofDNAsequence:Anempirical571
studybasedonwoodpeckers(Aves:Picidae).MolecularPhylogeneticsand572
Evolution,16,143-160.573
Degnan,J.H.,Rosenberg,N.A.,2009.Genetreediscordance,phylogeneticinferenceand574
themultispeciescoalescent.TrendsinEcologyandEvolution24,332–340.575
DesprésL,JaegerN(1999)Evolutionofovipositionstrategiesandspeciationinthe576
globeflowerfliesChiastochetaspp.(Anthomyiidae).JournalofEvolutionaryBiology,577
12,822-831.578
DesprésL,PettexE,PlaisanceV,PompanonF(2002)Speciationintheglobeflowerfly579
Chiastochetaspp.(Diptera:Anthomyiidae)inrelationtohostplantspecies,580
biogeography,andmorphology.MolecularPhylogeneticsandEvolution,22,258-268.581
DowlingDK,FribergU,LindellJ(2008)Evolutionaryimplicationsofnon-neutral582
mitochondrialgeneticvariation.TrendsinEcologyandEvolution,23,546-554.583
27
EarlDA,vonHoldtBM(2012)STRUCTUREHARVESTER:awebsiteandprogramfor584
visualizingSTRUCTUREoutputandimplementingtheEvannomethod.Conservation585
GeneticsResources,4,359-361.586
EatonDAR(2014)PyRAD:assemblyofdenovoRADseqlociforphylogeneticanalyses.587
Bioinformatics,30,1844-1849.588
EatonDAR,ReeRH(2013)Inferringphylogenyandintrogressionusinggenomic589
RADseqdata:Anexamplefromfloweringplants(Pedicularis:Orobanchaceae).590
SystematicBiology,62,689-706.591
Eaton,D.A.,Spriggs,E.L.,Park,B.,&Donoghue,M.J.(2016).MisconceptionsonMissing592
DatainRAD-seqPhylogeneticswithaDeep-scaleExamplefromFloweringPlants.593
SystematicBiology,syw092.594
EdgarRC(2004)MUSCLE:multiplesequencealignmentwithhighaccuracyandhigh595
throughput.NucleicAcidsResearch32,1792-1797.596
EscuderoM,EatonDA,HahnM,HippAL(2014)Genotyping-by-sequencingasatoolto597
inferphylogenyandancestralhybridization:AcasestudyinCarex(Cyperaceae).598
MolecularPhylogeneticsandEvolution,79,359-367.599
EspíndolaA,BuerkiS,AlvarezN(2012)Ecologicalandhistoricaldriversof600
diversificationintheflygenusChiastochetaPokorny.MolecularPhylogeneticsand601
Evolution,63,466-474.602
EvannoG,RegnautS,GoudetJ(2005)Detectingthenumberofclustersofindividuals603
usingthesoftwareSTRUCTURE:asimulationstudy.MolecularEcology,14,2611-604
2620.605
28
FairclothBC,BranstetterMG,WhiteND,BradySG(2014)Targetenrichmentof606
ultraconservedelementsfromarthropodsprovidesagenomicperspectiveon607
relationshipsamongHymenoptera.MolecularEcologyResources,15,489-501608
FunkDJ,OmlandKE(2003)Species-levelparaphylyandpolyphyly:frequency,causes,609
andconsequences,withinsightsfromanimalmitochondrialDNA.AnnualReviewof610
Ecology,EvolutionandSystematics,34,397-423.611
Gatesy,J,SpringerMS(2013)Concatenationversuscoalescenceversus612
“concatalescence”.ProceedingsoftheNationalAcademyofSciences,110,E1179-613
E1179.614
GautierM,GharbiK,CezardT,FoucaudJ,KerdelhuéC,PudloP,CornuetJ-M,EstoupA615
(2013)TheeffectofRADalleledropoutontheestimationofgeneticvariation616
withinandbetweenpopulations.MolecularEcology,22,3165-3178.617
GiarlaTC,EsselstynJA(2015)Thechallengesofresolvingarapid,recentradiation:618
Empiricalandsimulatedphylogenomicsofphilippineshrews.SystematicBiology,619
64,727-740.620
GoriK,SuchanT,AlvarezN,GoldmanN,DessimozC(2016)Clusteringgenesofcommon621
evolutionaryhistory.MolecularBiologyandEvolution,33,1590–1605622
GovindarajuluR,ParksM,TennessenJA,ListonA,AshmanTL(2015)Comparisonof623
nuclear,plastid,andmitochondrialphylogeniesandtheoriginofwildoctoploid624
strawberryspecies.AmericanJournalofBotany,102,544-554.625
29
HarveyMG,JudyCD,SeeholzerGF,MaleyJM,GravesGR,BrumfieldRT(2015)Similarity626
thresholdsusedinDNAsequenceassemblyfromshortreadscanreducethe627
comparabilityofpopulationhistoriesacrossspecies.PeerJ,3,e895.628
HarveyMG,SmithBT,GlennTC,FairclothBC,BrumfieldRT(2016)Sequencecapture629
versusrestrictionsiteassociatedDNAsequencingforshallowsystematics.630
SystematicBiology,65,910-924.631
HebertPDN,CywinskaA,BallSL,deWaardJR(2003)Biologicalidentificationsthrough632
DNAbarcodes.ProceedingsoftheRoyalSocietyB:BiologicalSciences,270,313-322.633
HennigW(Ed.)(1976)Anthomyiidae.DieFliegenderPalaearktischenRegion.Stuttgart,634
E.Schweizerbart.635
HerreraS,ShankTM(2016)RADsequencingenablesunprecedentedphylogenetic636
resolutionandobjectivespeciesdelimitationinrecalcitrantdivergenttaxa.637
MolecularPhylogeneticsandEvolution,100,70-79.638
HippAL,EatonDAR,Cavender-BaresJ,FitzekE,NipperR,ManosPS(2014)A639
frameworkphylogenyoftheamericanoakcladebasedonsequencedRADdata.640
PLoSONE,9,e93975.641
HovmöllerR,KnowlesLL,KubatkoLS(2013)Effectsofmissingdataonspeciestree642
estimationunderthecoalescent.MolecularPhylogeneticsandEvolution,69,1057-643
1062.644
HuangH,KnowlesLL(2014)Unforeseenconsequencesofexcludingmissingdatafrom645
next-generationsequences:simulationstudyofRADsequences.SystematicBiology,646
doi:10.1093/sysbio/syu046.647
30
IlutDC,NydamML,HareMP(2014)Defininglociinrestriction-basedreduced648
representationgenomicdatafromnonmodelspecies:Sourcesofbiasand649
diagnosticsforoptimalclustering.BioMedResearchInternational,2014,675158.650
JohannesenJ,LoeschckeV(1996)Distribution,abundanceandovipositionpatternsof651
fourcoexistingChiastochetaspecies(Diptera:Anthomyiidae).JournalofAnimal652
Ecology,65,567-576.653
JonesJC,FanS,FranchiniP,SchartlM,MeyerA(2013)Theevolutionaryhistoryof654
Xiphophorusfishandtheirsexuallyselectedsword:agenome-wideapproachusing655
restrictionsite-associatedDNAsequencing.MolecularEcology,22,2986-3001.656
KubatkoLS,DegnanJH(2007)Inconsistencyofphylogeneticestimatesfrom657
concatenateddataundercoalescence.SystematicBiology,56,17-24.658
LanfearR,FrandsenPB,WrightAM,SenfeldT,CalcottB(2017)PartitionFinder2:new659
methodsforselectingpartitionedmodelsofevolutionformolecularand660
morphologicalphylogeneticanalyses.MolecularBiologyandEvolution.34,772-773.661
LeachéAD,ChavezAS,JonesLN,GrummerJA,GottschoAD,LinkemCW(2015)662
Phylogenomicsofphrynosomatidlizards:conflictingsignalsfromsequencecapture663
versusrestrictionsiteassociatedDNAsequencing.GenomeBiologyandEvolution,7,664
706-719.665
LeachéAD,HarrisRB,RannalaB,YangZ(2013)Theinfluenceofgeneflowonspecies666
treeestimation:asimulationstudy.SystematicBiology,63,17-30.667
LowryDB,HobanS,KelleyJL,LotterhosKE,ReedLK,AntolinMF,StorferA(2017)668
BreakingRAD:anevaluationoftheutilityofrestrictionsite-associatedDNA669
31
sequencingforgenomescansofadaptation.MolecularEcologyResources,17,142–670
152.671
MaddisonWP(1997)Genetreesinspeciestrees.SystematicBiology,46,523-536.672
MaddisonWP,KnowlesLL(2006)Inferringphylogenydespiteincompletelineage673
sorting.SystematicBiology,55,21-30.674
MalletJ(2007)Hybridspeciation.Nature,446,279-283.675
MantheyJD,CampilloLC,BurnsKJ,MoyleRG(2016)Comparisonoftarget-captureand676
restriction-siteassociatedDNAsequencingforphylogenomics:atestincardinalid677
tanagers(Aves,Genus:Piranga).Systematicbiology,syw005.678
Mastretta-YanesA,ZamudioS,JorgensenTH,ArrigoN,AlvarezN,PiñeroD,EmersonBC679
(2014)Geneduplication,populationgenomics,andspecies-leveldifferentiation680
withinatropicalmountainshrub.Genomebiologyandevolution,6,2611-2624.681
Mastretta-YanesA,ArrigoN,AlvarezN,JorgensenTH,PiñeroD,EmersonBC(2015)682
Restrictionsite-associatedDNAsequencing,genotypingerrorestimationandde683
novoassemblyoptimizationforpopulationgeneticinference.MolecularEcology684
Resources,15,28-41.685
McCormackJE,FairclothBC,CrawfordNG,GowatyPA,BrumfieldRT,GlennTC(2012)686
Ultraconservedelementsarenovelphylogenomicmarkersthatresolveplacental687
mammalphylogenywhencombinedwithspeciestreeanalysis.GenomeResearch,688
22,746-754.689
32
McKinneyGJ,LarsonWA,SeebLW,SeebJE(2017)RADseqprovidesunprecedented690
insightsintomolecularecologyandevolutionarygenetics:commentonBreaking691
RADbyLowryetal.(2016).MolecularEcologyResources,doi:10.1111/1755-692
0998.12649.693
MichelsenV(1985)ArevisionoftheAnthomyiidae(Diptera)describedbyJ.W.694
Zetterstedt.Steenstrupia,11,37-65695
MillerMA,PfeifferW,SchwartzT(2010)CreatingtheCIPRESScienceGatewayfor696
inferenceoflargephylogenetictrees.In:ProceedingsoftheGatewayComputing697
EnvironmentsWorkshop(GCE),14Nov.2010,NewOrleans,LApp.1-8.698
NadeauNJ,MartinSH,KozakKM,SalazarC,DasmahapatraK,DaveyJW,BaxterSW,699
BlaxterML,MalletJ,JigginsCD(2013)Genome-widepatternsofdivergenceand700
geneflowacrossabutterflyradiation.MolecularEcology,22,814–826.701
PanteE,AbdelkrimJ,ViricelA,GeyD,FranceSC,BoisselierMC,SamadiS(2015)Useof702
RADsequencingfordelimitingspecies.Heredity,114,450-459.703
ParadisE,ClaudeJ,StrimmerK(2004)APE:analysesofphylogeneticsandevolutionin704
Rlanguage.Bioinformatics,20,289-290.705
PellmyrO(1989)Thecostofmutualism–interactionsbetweenTrolliuseuropaeusand706
itspollinatingparasites.Oecologia,78,53-59.707
PellmyrO(1992)Thephylogenyofamutualism:evolutionandcoadaptationbetween708
Trolliusanditsseed-parasiticpollinators.BiologicalJournaloftheLinneanSociety,709
47,337-365.710
33
PetersonBK,WeberJN,KayEH,FisherHS,HoekstraHE(2012)DoubledigestRADseq:711
AninexpensivemethodfordenovoSNPdiscoveryandgenotypinginmodeland712
non-modelspecies.PLoSONE,7,e37135.713
PhillipsMJ,HaoucharD,PrattRC,GibbGC,BunceM(2013)InferringKangaroo714
PhylogenyfromIncongruentNuclearandMitochondrialGenes.PLoSONE,8,715
e57745.716
PollardDA,IyerVN,MosesAM,EisenMB(2006)Widespreaddiscordanceofgenetrees717
withspeciestreeinDrosophila:Evidenceforincompletelineagesorting.PLoS718
Genetics,2,e173.719
PritchardJK,StephensM,DonnellyP(2000)Inferenceofpopulationstructureusing720
multilocusgenotypedata.Genetics,155,945-959.721
ReazR,BayzidMS,RahmanMS(2014)Accuratephylogenetictreereconstructionfrom722
quartets:Aheuristicapproach.PloSOne,9:e104008.723
RochS,WarnowT(2015)Ontherobustnesstogenetreeestimationerror(orlack724
thereof)ofcoalescent-basedspeciestreemethods.SystematicBiology,syv016.725
RokasA,CarrollSB(2005)Moregenesormoretaxa?Therelativecontributionofgene726
numberandtaxonnumbertophylogeneticaccuracy.MolecularBiologyand727
Evolution,22,1337-1344.728
RubinBER,ReeRH,MoreauCS(2012)InferringphylogeniesfromRADsequencedata.729
PLoSONE,7,e33394.730
34
RubinoffD,HollandBS(2005)Betweentwoextremes:mitochondrialDNAisneitherthe731
panaceanorthenemesisofphylogeneticandtaxonomicinference.Systematic732
Biology,54,952-961.733
SchmidS,GenevestR,GobetE,SuchanT,SperisenC,TinnerW,AlvarezN(2017)734
HyRAD-X,aversatilemethodcombiningexomecaptureandRADsequencingto735
extractgenomicinformationfromancientDNA.MethodsinEcologyandEvolution,736
doi:10.1111/2041-210X.12785737
SeehausenO,KoetsierE,SchneiderMV,ChapmanLJ,ChapmanCA,KnightME,Turner738
GF,vanAlphenJJM,BillsR(2003)Nuclearmarkersrevealunexpectedgenetic739
variationandaCongolese-NiloticoriginoftheLakeVictoriacichlidspeciesflock.740
ProceedingsoftheRoyalSocietyofLondonSeriesB,270,129-137.741
SpringerMS,GatesyJ(2016)Thegenetreedelusion.MolecularPhylogeneticsand742
Evolution,94,1-33.743
StamatakisA(2014)RAxMLVersion8:Atoolforphylogeneticanalysisandpost-744
analysisoflargephylogenies.Bioinformatics,30,1312-1313.745
SuchanT,BeauverdM,TrimN,AlvarezN(2015)AsymmetricalnatureoftheTrollius-746
Chiastochetainteraction:insightsintotheevolutionofnurserypollinationsystems.747
EcologyandEvolution,5,4766-4777.748
SuchanT,PitteloudC,GerasimovaNS,KostikovaA,SchmidS,ArrigoN,PajkovicM,749
RonikierM,AlvarezN(2016)HybridizationCaptureUsingRADProbes(hyRAD),a750
NewToolforPerformingGenomicAnalysesonCollectionSpecimens.PLoSONE,11,751
e0151651.752
35
SuchanT,EspíndolaA,RutschmannS,EmersonBC,GoriK,DessimozC,ArrigoN,Ronikier753
M,AlvarezN(2017)Datafrom:AssessingthepotentialofRAD-sequencingtoresolvephy-754
logenetic relationshipswithinspeciesradiations: the flygenusChiastocheta (Diptera:An-755
thomyiidae)asacasestudy.MendeleyData,https://doi.org/XXX756
SwoffordDL(2002)PAUP*:Phylogeneticanalysisusingparsimony(*andother757
methods).Version4.Sinauer,Sunderland,Massachusetts,USA.758
ToewsDP,BrelsfordA(2012)Thebiogeographyofmitochondrialandnuclear759
discordanceinanimals.MolecularEcology,21,3907-3930.760
TownsendTM,MulcahyDG,NoonanBP,SitesJW,KuczynskiCA,WiensJJ,ReederTW761
(2011)Phylogenyofiguanianlizardsinferredfrom29nuclearloci,anda762
comparisonofconcatenatedandspecies-treeapproachesforanancient,rapid763
radiation.MolecularPhylogeneticsandEvolution,61,363-380.764
WagnerCE,KellerI,WittwerS,SelzOM,MwaikoS,GreuterL,SivasundarA,Seehausen765
O(2013)Genome-wideRADsequencedataprovideunprecedentedresolutionof766
speciesboundariesandrelationshipsintheLakeVictoriacichlidadaptiveradiation.767
MolecularEcology,22,787–798.768
WalshHE,KiddMG,MoumT,FriesenVL(1999)Polytomiesandthepowerof769
phylogeneticinference.Evolution,53,932-937.770
WhitfieldJB,KjerKM(2008)Ancientrapidradiationsofinsects:challengesfor771
phylogeneticanalysis.AnnualReviewofEntomology,53,449-472.772
WhitfieldJB,LockhartPJ(2007)Decipheringancientrapidradiations.Trendsin773
EcologyandEvolution,22,258-265.774
36
WielstraB,ArntzenJW,vanderGaagKJ,PabijanM,BabikW(2014)DataConcatenation,775
BayesianConcordanceandCoalescent-BasedAnalysesoftheSpeciesTreeforthe776
RapidRadiationofTriturusNewts.PLoSONE,9,e111011.777
WilliamsJS,NiedzwieckiJH,WeisrockDW(2013)Speciestreereconstructionofa778
poorlyresolvedcladeofsalamanders(Ambystomatidae)usingmultiplenuclear779
loci.MolecularPhylogeneticsandEvolution,68,671-682.780
ZetterstedtJW(1845)Dipterascandinaviaedispositaetdescripta.IV,1281-1738.Lund,781
Sweden.782
ZinkRM,BarrowcloughGF(2008)MitochondrialDNAundersiegeinavian783
phylogeography.MolecularEcology,17,2107-2121.784
DataAccessibility785
DNAsequencesareavailableonGenbankunderaccessionsnoXXX-XXX.Nexusand786
STRUCTUREdatasetsusedfortheanalyses,MLphylogenyinferredforthemainRAD787
dataset,SVDquartetanalyses,andMLphylogenyinferredforthemtDNAdatasetare788
availableonMendeleyDatahttps://doi.org/XXX(Suchanetal.2017).789
AuthorContributions790
TS,NA,AE,KGandCDdesignedresearch,TS,AE,KG,andSRperformedresearchand791
analyzeddata.Allauthorswrotethepaper. 792
37
Figures793
794
795
Figure1.Phylogeniesobtainedfora)MLanalysisofthemtDNAdataset;b)MLanalysis796
oftheRADdataset;c)SVDquartetsanalysisofRADdataset;bootstrapnodesupports>797
80areshowndenotedbygraypoints,bootstrapnodesupports>90areshowndenoted798
byblackpoints.d)PopulationclusteringofthesampledChiastochetaspecimens,799
estimatedwithSTRUCTUREusingK=7value. 800
●●●●
●●●●
●●●●●●
●●●●●●●●
●●
●●● ●●●●●●●●●●●●
●●●●●●
●●●●●●
●●
●●●●●●
●● ●●
●●●●●●●●
●●●●
●●●●●●●●
●●●
●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●
●●● ●●●●●●
●●●
●●●●●●
●●●●●●●●●●●●●●
●
●
●●●●●●●●●● ●●● ●●●●●
●● ●●●●●●●●●●●●●●●●●●
●
●●
●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●● ●●●●●●●●●●
●●
●●●●●●
●●●
●
● ●
●
●●
● ●
●
●
●
●
●
●
● ●
●
●
●
a)
●●●●●●●
C. dentiferaC. inermellaC. lophotaC. macropygaC. rotundiventrisC. setiferaC. trollii
●●●●●●●●
●●●
●●●●●
● ●● ●●● ●●
●● ●●●●
●●●●●
●●●●
●●●●●●●
●●●
●●
● ●●●●
●●●
●●●● ●
●●
● ●●●●●●
● ●●●●
●●●●●● ●● ●●
●● ●●●
●●
●●●●
● ●●●● ●● ●●●
●●●●●
●●●●
●●●●●●
● ●●●●
●●●●●
● ●●●●
●●●●●●●
●
●●●● ●●● ●●
●●●●●●●
●
●● ●●●
●●●●●
●●
● ●●
●●●●● ●●
●●● ●● ●
● ●●●● ●
●● ●
●●●●●
●● ●●●●●
● ●●●
●● ● ●●●●●
●●
●●● ●●●●●
● ●●● ●●●●●
●●●
●●●●●●●●●●
●●●● ●●●●●
●● ●●●●●●
● ●●
●●● ●●●●●●●●●●●
●●●●
●
●
●
●
●
●●●
●●●
●
● ●●
●●● ●
●
●
●●●
●
● ●
● ●●●●●●
●
●
●
●●
●●●● ●●●●
● ●
●●
●
●
●
●
●●
●
●●
●●
●
●●●
●● ●●
b)
●
●●●●●●●●● ●
●●●●●●●●
●
● ● ●●●●● ●●
●●
●●● ●●●● ●●●●●●
● ●●●●●●●●
● ●●●
●●●●●● ●●
●
●●●●●●
●●● ●●
●●●●●●●●
●●●●●●
● ●●●●
● ● ●●
●●● ●●●●●●●●●
●●●●●●●●●
●●●●
●● ● ●●
●●● ●●●●●
●●●●● ●●●
●●●●●
● ●●● ●●
●●●●●
●●●●
●
●●
●●●●●●●●●
●●●●●
●●
●
●● ●●●●●●
●●●● ●●●●●●●
●●●●●●●●
●●●●●●●●●
●
● ●●●
●●
●●●●●●
●●●
● ●●●●●●
●● ●●●●●●
●● ●●●●●● ●●●
● ●●●●
●●●●●
● ●●●●●
● ●●●
●●● ●●
●●●●●●
●
●
● ●●
●
●
●
●●●
●●●
● ●
● ●●
●
●●
● ●
●●●
●
●
●●
●
●●
●●
● ●
●
●
●●
● ●●
●
●
●
●
●● ●
●● ●
● ●
c)c) d)
38
801
802
Figure2.Phylogenetictreesonthefourbins,asidentifiedbythetreeClanalysis,803
consideringonlythelocipresentinatleast100specimens.Bootstrapnodesupports>804
80areshowndenotedbygraypoints,bootstrapnodesupports>90areshowndenoted805
byblackpoints. 806
●●
●
● ●●
●
●
58 loci
● ●
●● ●
●●●
●
●●●●
●●
●
●
●
●●●
● ●●●
●
●
● ● 29 loci
●
●●
●
●●
●●
●●●
● ●
●
● ●● ●
●
●●
●
●●
● 42 loci
●
●●
●●● ●● ●
● ●
●● ●●● ●
●●
●
●● ●
● ●●
●
27 loci
39
Tables807
Table1.Populationsincludedinthestudy,withgeographicalcoordinatesandthe808
numberofspecimensusedinthefinalanalyses.LettercodesdenoteChiastocheta809
species:C.dentifera(D),C.inermella(I),C.lophota(L),C.macropyga(M),C.810
rotundiventris(R),C.setifera(S),andC.trollii(T).811
code site latitude longitude year D I L M R S T sum
AMB Ambri 46.50680 8.70292 2008 2 4 2 1 9 AMO Amot 59.62199 8.42346 2007 4 4 BAY Bayasse 44.30814 6.74067 2007 1 2 2 2 3 10 BEI Beistohlen 61.20761 8.95473 2007 5 5 BID Bidjovagge 69.29778 22.47808 2008 1 1 2 BON Colde
Bonnecombe 44.57557 3.11410 2007
3 1 1 2 7 BRA Braas 57.09309 15.06817 2007 1 1 CCO ColdelaCo-
lombière 45.98722 6.46972 2006
2 1 2 1 6 CDV CreuxduVan 46.93526 6.74119 2006 1 2 2 2 7 CHA Chasseral 47.12569 7.02130 2006 5 3 1 9 CHE Chemin 46.08993 7.08978 2006
3 1 3 2 1 2 12 CRA Crans-Montana 46.34650 7.53890 2006 1 1 3 1 1 7 CRE CressbrookDale 53.26724 -1.74041 2008 2 2 CTP ColtPark 54.19365 -2.35247 2008 4 1 1 6 DON Donovaly 48.88922 19.23068 2008 3 1 2 6 EID EiddaPastures 53.03720 -3.74190 2008 2 2 ELL Ellingsrudelva 59.91771 10.91844 2007 1 1 EPO Esposouille 42.62341 2.09450 2008
1 1 2 2 3 9 FRO Froson 63.18205 14.60268 2007 2 2
40
GAL ColduGalibier 45.08528 6.43861 2006 2 2 3 3 10 GLE GlenFender 56.78138 -3.79485 2008 2 2 4 HT1 HauteTinee1 44.29617 6.81871 2007 3 3 HT2 HauteTinee2 44.28426 6.85581 2007 3 1 4 KRA KrasnoPolje 44.80869 14.97271 2008 1 1 LAK Laktatjakka 68.42931 18.40674 2007 1 4 2 7 LFE LoughFern 55.06569 -7.71130 2008 1 1 LOS Loser 47.66052 13.78485 2007 4 3 1 8 MOE Moerlimatt 47.90597 8.07760 2007 1 1 1 1 4 MTP MontePizi 41.91524 14.16714 2008 2 2 NAV Naverdal 62.70417 10.13002 2007 3 1 4 PAJ PajinoPreslo 43.27799 20.81970 2008 2 2 PAN Puertode
Panderrueda 43.12743 -4.97223 2008
1 1 3 2 1 8 PIL Pila 48.90017 20.29449 2008
3 2 1 6 POD Podlesok 48.94962 20.35190 2008 1 1 2 PPN PetitPapaNoel 66.51647 25.79386 2007
1 3 2 3 9 PYD PuydeDome 45.77222 2.96333 2006 2 2 2 6 PYM PuyMary 45.11139 2.68083 2006 1 1 PYS PuydeSancy 45.53500 2.80972 2006 3 2 1 6 RAD Radkow 50.46866 16.35321 2008
2 4 2 8 RIS Risnjak-Snjeznik 45.43871 14.58494 2008 1 2 3 SAL Salla 66.83020 28.65427 2007
1 2 1 4 8 SED SededePan 43.03949 -0.48651 2008 2 2 SET Seterasen 65.53432 13.67744 2007
1 4 1 1 1 8 SOL Solberga 57.95194 13.56116 2007
4 2 2 1 9 STE Steingaden 47.59529 11.01296 2007 1 1 31 5
41
STR Straumen 67.38440 15.64921 2007 2 2 SUS Susch 46.74728 10.07473 2006
2 2 2 1 3 10 SVA Svartla 65.99583 21.22062 2007
3 3 6 TAR Tarasp 46.77730 10.25056 2006 1 2 1 3 7 VIT Vitosha 42.59032 23.29342 2008 2 2 ZAL ZaliLog 46.20342 14.11080 2008 3 2 1 6
total: 21 44 42 33 59 34 38 271
812
42
AppendixA.Supplementarymaterial813
TableS1.SummarystatisticsfortheRAD-sequencedsamples:numberofRAD814
fragmentsclustersandmeancoverageafterretainingclusterswithacoverage>5,815
estimatedheterozygosities,numberofconsensuslociafterparalogfiltering,andthe816
numbersoflociretainedforeachdatasetafterfilteringforcoverageamongthesamples.817
818
Fig.S1.MapofthesampledChiastochetaspecimensusedinthestudy.819
820
Fig.S2.Theeffectofdifferentclusteringthresholds(X-axis)andminimumloci821
coverages(indicatedbycolors:red–10,blue–20,green–100individuals)onthetotal822
numberofassembledloci,proportionofmissingdata,locioverlapamongthetechnical823
replicates,andmeannumberofindividualsperlocus.824
825
Fig.S3.PatternofRAD-seqlocisharingamongthesequencedindividualsfordatasets:826
a)themaindatasetusingclusteringsimilarityof75%andminimumlocicoverage827
amongindividualsof20;b)usingclusteringsimilarityof75%andminimumloci828
coverageamongindividualsof100.829
830
Fig.S4.a)MLphylogenyinferredforthemtDNAdataset;b)MLphylogenyinferredfor831
theRAD-seqdataset;c)SVDquartetsphylogenyinferredfortheRAD-seqdataset;832
bootstrapnodesupports>80areshowndenotedbygraypoints,bootstrapnode833
supports>90areshowndenotedbyblackpoints.834
835
Fig.S5.STRUCTURErunsforK=2to7,plottedagainsttheRAD-seqbasedphylogeny.836
43
Fig.S6.Phylogenetictreesonthelocipartitionedintothesetsof2to6clusters,837
consideringonlythelocipresentinatleast100specimens.Bootstrapnodesupports>838
80areshowndenotedbygraypoints,bootstrapnodesupports>90areshowndenoted839
byblackpoints.Numbersbelowthetreesdenotethenumberofclustersintowhichthe840
datasetwasdivided.Plotoflog-likelihoodimprovementversusthenumberofclusters841
ispresentedinthefirstbox.842
843
AppendixS1.RAD-sequencingprotocol. 844