43
1 Assessing the potential of RAD-sequencing to resolve 1 phylogenetic relationships within species radiations: the 2 fly genus Chiastocheta (Diptera: Anthomyiidae) as a case 3 study 4 5 Tomasz Suchan 1,2 , Anahí Espíndola 3* , Sereina Rutschmann 1,4* , Brent C. Emerson 5,6 , Kevin 6 Gori 7 , Christophe Dessimoz 1,8,9,10 , Nils Arrigo 1 , Michał Ronikier 2 , Nadir Alvarez 1 7 8 1 Department of Ecology and Evolution, University of Lausanne, Switzerland 9 2 W. Szafer Institute of Botany, Polish Academy of Sciences, Kraków, Poland 10 3 Department of Biological Sciences, University of Idaho, Moscow, ID, USA 11 4 Department of Biochemistry, Genetics and Immunology, University of Vigo, Spain 12 5 Island Ecology and Evolution Research Group, Instituto de Productos Naturales y 13 Agrobiología (IPNA-CSIC), La Laguna, Tenerife, Canary Islands, Spain 14 6 School of Biological Sciences, University of East Anglia, Norwich, UK 15 7 Department of Veterinary Medicine, University of Cambridge, UK 16 8 Center for Integrative Genomics, University of Lausanne, Switzerland 17 9 Department of Genetics, Evolution & Environment and Department of Computer 18 Science, University College London, UK 19 10 Swiss Institute of Bioinformatics, Lausanne, Switzerland 20 21 * these authors are considered as second co-authors 22 23 corresponding authors: 24 Tomasz Suchan ([email protected]), W. Szafer Institute of Botany, Polish Academy of 25 Sciences, Lubicz 46, 31-512 Kraków, Poland 26 Nadir Alvarez ([email protected]), Department of Ecology and Evolution, University 27 of Lausanne, Biophore Building, 1015 Lausanne, Switzerland; fax: +41 21 692 41 65 28

Assessing the potential of RAD-sequencing to resolve ...lab.dessimoz.org/papers/17_ChiastochetaphylogenyMPE.pdf · 35 generation-sequencing technologies and associated techniques

Embed Size (px)

Citation preview

1

AssessingthepotentialofRAD-sequencingtoresolve1

phylogeneticrelationshipswithinspeciesradiations:the2

flygenusChiastocheta(Diptera:Anthomyiidae)asacase3

study4

5

TomaszSuchan1,2,AnahíEspíndola3*,SereinaRutschmann1,4*,BrentC.Emerson5,6,Kevin6

Gori7,ChristopheDessimoz1,8,9,10,NilsArrigo1,MichałRonikier2,NadirAlvarez1 7

8

1DepartmentofEcologyandEvolution,UniversityofLausanne,Switzerland9 2W.SzaferInstituteofBotany,PolishAcademyofSciences,Kraków,Poland10 3DepartmentofBiologicalSciences,UniversityofIdaho,Moscow,ID,USA11 4DepartmentofBiochemistry,GeneticsandImmunology,UniversityofVigo,Spain12 5IslandEcologyandEvolutionResearchGroup,InstitutodeProductosNaturalesy13

Agrobiología(IPNA-CSIC),LaLaguna,Tenerife,CanaryIslands,Spain14 6SchoolofBiologicalSciences,UniversityofEastAnglia,Norwich,UK15 7DepartmentofVeterinaryMedicine,UniversityofCambridge,UK16 8CenterforIntegrativeGenomics,UniversityofLausanne,Switzerland17 9DepartmentofGenetics,Evolution&EnvironmentandDepartmentofComputer18

Science,UniversityCollegeLondon,UK19 10SwissInstituteofBioinformatics,Lausanne,Switzerland20

21 *theseauthorsareconsideredassecondco-authors22

23

correspondingauthors:24

TomaszSuchan([email protected]),W.SzaferInstituteofBotany,PolishAcademyof25

Sciences,Lubicz46,31-512Kraków,Poland26

NadirAlvarez([email protected]),DepartmentofEcologyandEvolution,University27

ofLausanne,BiophoreBuilding,1015Lausanne,Switzerland;fax:+41216924165 28

2

Abstract29

Determiningphylogeneticrelationshipsamongrecentlydivergedspecieshaslongbeen30

achallengeinevolutionarybiology.Cytoplasmicmarkers,whichhavebeenwidelyused31

notablyinthecontextofmolecularbarcoding,havenotalwaysprovedsuccessfulin32

resolvingsuchphylogenies,butphylogeniesforcloselyrelatedspecieshavebeen33

resolvedatamuchhigherdetailinthelastcoupleofyearswiththeadventofnext-34

generation-sequencingtechnologiesandassociatedtechniquesofreducedgenome35

representation.Hereweexaminethepotentialandlimitationsofoneofsuchtechniques36

—Restriction-siteAssociatedDNA(RAD)sequencing,amethodthatproduces37

thousandsof(mostly)anonymousnuclearmarkers,indisentanglingthephylogenyof38

theflygenusChiastocheta(Diptera:Anthomyiidae).Thisgenusencompassesseven39

describedspeciesofseedpredators,whichhavebeenwidelystudiedinthecontextof40

theirecologicalandevolutionaryinteractionswiththeplantTrolliuseuropaeus41

(Ranunculaceae).Sofar,phylogeneticanalysesusingmitochondrialmarkersfailedto42

resolvemonophylyofmostofthespeciesfromthisrecentlydiversifiedgenus,43

suggestingthattheirtaxonomymayneedtoberevised.However,relyingonasingle,44

non-recombiningmoleculeandignoringpotentialincongruencesbetween45

mitochondrialandnuclearlocimayprovideincompleteaccountofalineagehistory.In46

thisstudy,weapplybothclassicalSangersequencingofthreemtDNAregionsandRAD-47

sequencing,forreconstructingthephylogenyofthegenus.Contrastingwithresults48

basedonmitochondrialmarkers,RAD-sequencinganalysesretrievedthemonophylyof49

allsevenspecies,inagreementwiththemorphologicalspeciesassignment.Wefound50

robustnuclear-basedspeciesassignmentofindividualsamples,andlowlevelsof51

estimatedcontemporarygeneflowamongthem.However,despiterecoveringspecies’52

3

monophyly,interspecificrelationshipsvarieddependingonthesetofRADloci53

considered,producingcontradictorytopologies.Moreover,coalescence-based54

phylogeneticanalysesrevealedlowsupportsformostoftheinterspecificrelationships.55

OurresultsindicatethatdespitethehigherperformanceofRAD-sequencingintermsof56

speciestreesresolutioncomparedtocytoplasmicmarkers,reconstructinginter-specific57

relationshipsmayliebeyondthepossibilitiesofferedbylargesetsofRAD-sequencing58

markersincasesofstronggenetreeincongruence.59

60

Keywords:coalescentanalysis;DNAbarcoding;maximumlikelihood;mito-nuclear61

incongruence;singlenucleotidepolymorphisms;quartetinference62

63

4

1.Introduction64

Recentlydivergedlineagesposeaproblemfortraditionalphylogeneticapproachesthat65

typicallyrelyonasmallsetofrelativelyslowlyevolvingloci(DeFilippis2000),often66

lackingresolutionatnarrowerevolutionaryscales(Cariouetal.2013).Inaddition,67

complexprocessessuchasincompletelineagesorting(Aviseetal.2008;Maddison&68

Knowles2006;Pollardetal.2006)andgeneflowamongspecies(Leachéetal.2013)69

increaseincongruencesamonggenetreesandtopologicaldeviationsfromthespecies70

tree(Dengan&Rosenberg2009;Maddison1997).Thisisespeciallytrueforlineages71

thathaveundergonerapidradiations,inwhichancestralpolymorphismssorted72

idiosyncraticallyintothedescendanttaxathroughshortevolutionarynodes(Aviseetal.73

2008),andincaseswheresubsequentevolutionaryeventsmayblurphylogeneticsignal74

(Whitfield&Kjer2008;Whitfield&Lockhart2007).Samplingmorelocihasbeen75

showntobeapromisingapproachinsuchcases(Rokas&Carroll2005;Townsendetal.76

2011;Wielstraetal.2014;Williamsetal.2013),butthespectrumofgeneticmarkers77

developedforphylogenyestimationisstilllimited(Whitfield&Kjer2008).78

Next-generationsequencingapproaches,particularlyreducedrepresentationgenome79

sequencing(Daveyetal.2011),offerthepossibilitytosamplethousandsofgenomic80

markersfromnon-modelspecies.Amongthem,Restrictionsite-AssociatedDNA(RAD;81

Bairdetal.2008)techniquesrelyonthesequencingofshortDNAfragmentsflanking82

restrictionsites,generatingrandomanonymousgenomicmarkers,homologousacross83

theanalyzedsamples(Andrewsetal.2016;Davey&Blaxter2010).Fromaphylogenetic84

perspective,animportantaspectofRADmarkersistheriseintheproportionof‘null85

alleles’asgenomedivergenceacrosssamplesincreases.Thisphenomenoniscausedby86

randommutationsoccurringintherestrictionsitesthatdecreasethenumbersof87

5

sharedRADlociamongtaxa,resultingindatamatricescontaininglargeamountsof88

missingdata(Cariouetal.2013;Chattopadhyayetal.2014;Gautieretal.2013).89

However,usinganin-silicoapproachRubinetal.(2012)andCariouetal.(2013)have90

shownthatRAD-seqdatacanbeusedsuccessfullytoresolvespeciesrelationshipsthat91

transcendtimescalesupto60Mya(millionyearsago).ExperimentallysampledRAD92

datasetshavebeenappliedtoreconstructphylogeneticrelationships,mostlyamong93

recentlydivergedtaxa(e.g.,Eaton&Ree2013;Harveyetal.2016;Jonesetal.2013;94

Leachéetal.2015;Nadeauetal.2013;Wagneretal.2013),withfewerstudiesinvolving95

moredistantlyrelatedones,evenuptoeven80Mya(e.g.,Cruaudetal.2014;Eatonet96

al.2016;HerreraandShank2016;Hippetal.2014;Panteetal.2015).Althoughthese97

genomicdatasetsimprovedphylogeneticinferencesforgroupsthatwereambiguous98

usingclassicalmarkers(e.g.,Escuderoetal.2014;Hippetal.2014),thepotentialutility99

ofRADlociforresolvingmorecomplexphylogenetichistories,suchasthosewhere100

historicalintrogressionhasoccurredorthoseassociatedwithincompletelineage101

sorting,remainsstillpoorlyexplored(Combosch&Vollmer2015;Eaton&Ree2013).102

Moreover,theuseofRADdatasetsasmarkersforevolutionarygeneticshasrecently103

beenheavilydiscussed(Lowryetal.2017;McKinneyetal.2017).104

Inthisstudy,wetesttheutilityofRAD-sequencingtorecoverphylogenetic105

relationshipsinagenusofseedparasiticpollinatorsofTrolliuseuropaeus106

(Ranunculaceae)—fliesfromthegenusChiastochetaPokorny,1889(Diptera:107

Anthomyiidae).Here,sequencingofmitochondrialmarkersfailedtorevealthe108

monophylyandphylogeneticrelationshipsamongpreviouslymorphologically109

describedspecies(Desprésetal.2002;Espíndolaetal.2012).Thisdiscordance110

betweenmorphologyandmitochondrialphylogenyhasbeeninterpretedasacallfora111

6

taxonomicrevision,andapossiblereconsiderationofconclusionsfromprevious112

ecologicalandevolutionarystudies(Espíndolaetal.2012).However,several113

mechanismsmaycausemitochondrianottotrackspeciesevolution(Funk&Omland114

2003)and,indeed,therearemanycaseswheremitochondrialandnucleargenetrees115

havebeenshowntobeincongruent(e.g.,Govindarajuluetal.2015;Phillipsetal.2013;116

Seehausenetal.2003).Asrelyingonasingle,non-recombiningmoleculemayprovidea117

misleadingaccountofaspecieshistory(Ballard&Whitlock2004),utilizingalargeset118

ofindependentnuclearloci(sampledthroughRAD-sequencing)shouldallowustotest119

themonophylyofthemorphologicallydescribedspeciesandresolvephylogenetic120

relationshipsamongthem.Whetherornotmolecularmarkersareabletoreveal121

scenariosofrapidradiationsisstillanopenquestion(Giarla&Esselstyn2015).In122

these,identifyingasinglespeciestreemightliebeyondanalyticalpossibilitiesdueto123

pervasiveconflictsamongthegenetrees,particularlywhenpopulationsizesarelarge124

andspeciationeventshappenatahigherratethanthemutation-driftequilibrium,125

eventuallyproducingconflictingtopologies.Inordertoexploregeneandspeciestrees,126

weappliedbothaconcatenation-basedphylogeneticapproach(i.e.,RAxML;Stamatakis127

2014)andacoalescence-basedinferencemethod(i.e.,SVDquartets;Chifman&Kubatko128

2014)toaRAD-seqdatasetencompassingspecimensfrom51Europeanpopulations,129

representativeofthesevenrecognizedChiastochetamorphospecies.Inorderto130

examinetheextenttowhichdifferentcombinationsofRADlocimayproducedistinct131

speciestrees,weusedanewlydevelopedalgorithmthatperformslocibinning,using132

dissimilaritylevelsamongphylogeneticpatternsretrievedatsingleloci(treeCl;Goriet133

al.2016).Wealsoappliedpopulationgeneticsclusteringalgorithms(i.e.,STRUCTURE;134

Pritchardetal.2000)asacontrol.Eventually,wecomparedourresultstothose135

7

obtainedwithclassicalphylogeneticinferencebasedonconcatenationofthree136

mitochondrialregions.137

2.Materialsandmethods138

2.1Studysystem139

ThecenteroforiginanddiversityofChiastochetahasbeeninferredtobetheWestern140

Palearctic,wheresevenflyspeciesareinvolvedinnurserypollinationinteractionswith141

TrolliuseuropaeusL.(Espíndolaetal.2012;Pellmyr1989,1992;Suchanetal.2015).142

ThesesevenmorphologicallydelimitedEuropeanChiastochetaspecies,namelyC.143

dentiferaHennig1953;C.inermella(Zetterstedt,1838);C.lophotaKarl,1943;C.144

macropygaHennig,1953;C.rotundiventrisHennig,1953;C.setiferaHennig,1953andC.145

trollii(Zetterstedt,1838)areecologicallyverysimilarandoftensympatric(Collin1954;146

Hennig1976;Michelsen1985;Zetterstedt1845;V.Michelsenpers.comm.).Inhis147

monographofthisplant-pollinatorinteraction,Pellmyr(1992)discussedanother148

species,C.abruptiventrisasanorthernvicariantofC.rotundiventris,ataxonnot149

supportedbypreviousmolecularstudies(Espíndolaetal.2012)andneverformally150

described.AlthoughallChiastochetareproducewithintheflowersofT.europaeus,with151

potentialcross-speciesmatingpossibilities,noputativehybridshavebeenobserved152

basedongenitalmorphology(T.SuchanandA.Espíndola,pers.obs.).153

Althoughthespeciesarewelldefinedmorphologically,mitochondrialphylogenies154

recoveredonlythreemonophyleticclades–C.rotundiventris,C.dentifera,andC.lophota155

(Desprésetal.2002;Espíndolaetal.2012),andsuggestedapolyphyleticoriginforC.156

inermellaandC.setifera(Desprésetal.2002;Espíndolaetal.2012),withC.macropyga157

andC.trolliibeingparaphyletic(Espíndolaetal.2012).Moleculardatingplacedthe158

8

mostrecentcommonancestorofallEuropeanspeciesattheendofthePliocene(2-3.4159

Ma;Desprésetal.2002;Espíndolaetal.2012),andindicatedthatmostdiversification160

eventsoccurredwithinthelast1.6Ma.161

2.2Sampling162

Chiastochetaspecimensweresampledfrom51Europeanpopulationsduringspringand163

summer2006,2007,and2008(Table1;mapsonFig.S1).Theflieswerekilledand164

preservedin70%ethanolandstoredatroomtemperatureuntilDNAextraction.165

CollectedspecimenswereidentifiedtomorphospeciesfollowingHennig(1976)and166

unpublishedkeys(V.Michelsen).Allidentificationswereconfirmedbyanexpert(V.167

Michelsen,NaturalHistoryMuseumofDenmark,Copenhagen),asthetaxonomical168

revisionofthegenusisnotyetpublished.169

2.3SequencingmitochondrialregionsandRADmarkers170

DNAwasextractedfrominsectlegsusingaDNeasyBloodandTissueKit(Qiagen,171

Hilden,Germany),followingthemanufacturer’sinstructions.Weamplifiedthree172

mitochondrialregions:COI,COII,andtheultra-variableD-loop(control)region.We173

followedEspíndolaetal.(2012)forsequencingoftheCOIandCOIIregions.ForD-loop174

weusedprimersTM-N-193andSR-J-14612asdesxcribedinSimonetal.(1994)as175

describedbyEspíndolaetal.(2012)withthefollowingmodificationofthePCR176

program:5minat95°C,followedby35cyclesof1minat95°C,1minofannealing177

at55°Cand2minofelongationat60°C,and5minoffinalelongationat60°C.PCR178

productsweresequencedatMacrogenInc.(SouthKorea)andFasterisSA(Switzerland).179

ChromatogramswerevisuallycorrectedonChromasPro1.41(TechnelysiumPty.Ltd.).180

AlignmentwasperformedusingMUSCLEalgorithm(Edgar2004)inGeneious10.1.3181

9

(Biomatters,Auckland,NewZeland)andgapswithmorethan50%missingdatainthe182

D-loopregionwereremoved.Additionally,adatasetwithD-loopremovedwas183

analyzed.DoubledigestRAD(ddRAD)librarieswerepreparedaccordingtoMastrettaet184

al.(2014),amodifiedprotocolofPetersonetal.(2012),withoutperformingthesize-185

selectionofDNAfragments,andotherminormodifications(seeSupporting186

Information).TheenzymesusedforDNAdigestionwereSbfIandMseI.Librarieswere187

sequencedattheLausanneGenomicTechnologiesFacility(Switzerland)onthreelanes188

oftheHiSeq2500instrument(Illumina,SanDiego,USA)usinga2x100bppaired-end189

readsprotocol.ForRAD-sequencingweintroducedtechnicalreplicatesforoptimizing190

denovoassemblyandcontrolsfortheeffectsofsequencingerrorsandalleledropouton191

thefinalresults(Mastretta-Yanesetal.2015;sampleswith“REPL”suffixinTableS1),192

andDNAextractionreplicatesfromtheflythoracicmuscle(inordertocontrolforflies’193

bodycontaminationwithpollen;sampleswith“MUS”suffixinTableS1).194

2.4RAD-seqlociassembly195

TwoimportantconsiderationsfordenovoRADlociassemblyaretheparametersfor196

clusteringorthologousloci,whilefilteringoutparalogs(Eaton2014;Mastrettaetal.197

2015).Ifthesequencesimilarityrequiredtoconsidersequenceasorthologsissettoo198

high,realheterozygousalleleswillbesplitintomorethanonecluster,therefore199

creatingfalsehomozygousloci(Harveyetal.2015).Ontheotherhand,ifthesimilarity200

issettoolow,thiswillresultinparalogoussequencesbeingclusteredtogether.Several201

methodswereproposedforfilteringoutsuchsequencesfromthefinaldataset,202

includingploidyfiltering(removingclustersthathavemorethantwosequencesper203

individual)andfilteringouthighlyvariableloci(Eaton2014;Ilutetal.2014).Asthere204

10

arenogeneralguidelinesforfine-tuningtheparametersmentionedabove,we205

empiricallytestedhowthedifferentclusteringparametersaffectedthefinaldataset.206

Finally,wechosethedatasetwithclusteringparameterthatmaximizedthelocioverlap207

betweenpairsoftechnicalreplicates(seebelow).Locioverlapamongsamplesandpairs208

oftechnicalreplicateswerecalculatedusingtheRADami1.0libraryinR(Hippetal.209

2014).210

ReaddemultiplexinganddenovoassemblyofRADlociwasperformedusingthepyRAD211

3.0.1package(Eaton2014),basedonanalignment-clusteringalgorithm.Thisapproach212

allowsforindelvariationamongmoredivergedspecimens.Moreover,italsoallowsfor213

lowersimilarityamongtheclusteredreads,makingitwell-suitedforphylogenetic-scale214

analyses.First,thereadsweredemultiplexedaccordingtothein-line6-nucleotides215

barcodepresentatthebeginningofeachsequencedfragment,whileallowingforone216

mismatch.Onlyreadswiththerestrictionsitepresentwereretainedforfurther217

analyses.AllnucleotideswithPhredqualityscorelowerthan20wereconvertedto218

unknownbasesandreadswithmorethanfourunknownsiteswereremovedfromthe219

dataset.Readswerethenclusteredwithinandbetweenindividuals,withaminimum220

numberofsixreadstoformaclusterandsequencesimilarityof75%,80%,85%,90%,221

and95%.Possibleclusteredparalogsorrepetitivesequenceswereremovedbyfiltering222

outthelocithathadmorethanfivevariablepositionsperlocusormorethan10shared223

polymorphicsitesinalocusamongindividuals,andthelociforwhichmorethantwo224

alleleswerepresentperindividual.Finally,datasetswereproducedbyretainingtheloci225

presentinaminimumof10,20,and100individualsandcomparedforthetotalnumber226

ofloci,proportionofmissingdata,locioverlapamongreplicates,andthemeannumber227

ofindividualsperlocus.228

11

2.5Phylogeneticanalyses229

WeperformedMaximum-likelihood(ML)analysesusingRAxML(Stamatakis2014)230

withrapidbootstrapanalysesandextendedmajority-ruleconsensustreeautomatic231

bootstrapstoppingcriterion,followingsearchforthebest-scoringMLtree.The232

mitochondrialregionswerepartitionedusingthePartitionFinder2.1.1software233

(Lanefaretal.2016).AnalyseswereperformedintheRAxML8.2.4,forthe234

mitochondrialdataset.ForthedatasetconsistingofCOI,COII,andD-loopregionsthe235

GTR+G+Imodelwithallthreenucleotidepositionsoncodinggenesconsideredas236

separatepartitionsandD-loopasafourthpartition.ForthedatasetconsistingofCOI237

andCOIItheGTR+Gmodelwiththefirsttwonucleotidepositionsconsideredasafirst238

andthirdnucleotidepositionconsideredasasecondpartition.AnalysesfortheRAD239

datasetwereperformedusingtheGTRCATmodelintheRAxML8.2.10versiononthe240

CIPREScluster(SanDiegoCA,USA;Milleretal.2010).FortheRAD-baseddataset,241

replicatedsampleswereretainedinthephylogeneticanalysesandtheconcatenated242

matrixwasconsideredasasinglepartition.Additionally,MLanalysesoftheRAD243

datasetswithotherclusteringparameterswereperformedinordertoevaluatehowthis244

parameteraffectsthetreetopology.ThetreeswererootedwithC.rotundiventrisasan245

outgroup,asidentifiedpreviously(Desprésetal.2002;Espíndolaetal.2012).246

Toaccountfortheeffectsofincongruenceamongnuclearlociontheinferred247

phylogenies—forinstanceresultingfromincompletelineagesorting,weappliedthe248

methodofGorietal.(2016)implementedinthetreeClpackage(http://git.io/treeCl).249

BecausethemajorityofRADlocihadsparsecoverageovertheindividuals,wekeptonly250

locipresentinmorethan100individualsforthispartofanalysis.TheML251

phylogenenieswerefirstcalculatedforeverylocususingtheGTR+Gmodelas252

12

implementedinRAxML8.1.11(Stamatakis2014).Then,pairwisegeodesicdistances253

betweenalltheresultingsingle-locusphylogeniesweremeasured,andthetreeswere254

groupedbasedonthedistancematrixusingspectralclustering(aprotocolhereafter255

referredtoasbinning).Thenumberofbinswasestimatedusingthenonparametric256

bootstrappingstoppingcriterion.Supportforeachbranchineachtopologywas257

calculatedusingaBayesinPhyML(Anisimovaetal.2011).Wealsoanalyzedthelog-258

likelihoodimprovementwhenanalyzingthedatawithn+1splitsvs.nsplits,compared259

tothenullexpectation(i.e.randomlociclustering).260

Additionally,weappliedacoalescent-basedinferencemethodusingSVDquartets261

(ChifmanandKubatko2014)asimplementedinPAUP*v.4a150andv.4a151(Swofford262

2002).Thismethodinfersthetopologyamongrandomlysampledquartetsusinga263

coalescentmodel,andassemblestherandomlysampledquartetsusingaquartet264

amalgamationmethod.Breakingthesequenceintoquartetsmakestheanalysisoflarge265

numbersoflocifeasible.Werandomlysampledthemaximumofallpossiblequartets266

(i.e.48,603,900quartets=200taxa)withthemultispeciescoalescentoptionand1,000267

bootstrapreplicates.ThequartetsweresummarizedwiththeQFM(Reazetal.2014)268

quartetamalgamationprogramasimplementedinPAUP*.Phylogenetictreeswere269

visualizedusingtheape3.2Rpackage(Paradisetal.2004).270

2.6Populationstructure271

Weinferredpopulationstructureusingtheadmixturemodelimplementedin272

STRUCTURE2.3.4(Pritchardetal.2000),withoutpriorpopulationassignmentandwith273

allelefrequenciescorrelatedamongpopulations.ThesoftwareusesaBayesian274

frameworktoestimatethelikelihoodofthedatagivenanumberofaprioridefinedK275

13

populationclusters,outputtingthelikelihoodofeachsampletobelongtoeachpossible276

cluster.Thisanalysiswasperformedafterremovingtechnicalreplicatesfromthe277

dataset,retainingonlythelocipresentinaminimumof20individuals,andselecting278

onerandomsingleSNPfromeachlocus.AnalyseswererunforKvaluesranging279

between1and8,withaburn-inof200Kcycles,followedby1Mcyclesofsampling,with280

3replicatesforeachKvalue.TheoptimalKvaluewasidentifiedfollowingEvannoetal.281

(2005),asimplementedinSTRUCTUREHARVESTER(Earl&vonHoldt2012).To282

accountforthephylogeneticcomponentinthemissingalleles,weranSTRUCTUREwith283

therecessiveallelesmodel,withmissingdatacodedasrecessive.284

3.Results285

3.1Chiastochetasampling286

Weanalyzedatotalof272ChiastochetaspecimenssampledfromtheentireEuropean287

rangeofthegenus(seeTable1andFig.S1formapsofthesampledspecimens).Most288

speciesdisplayedabroadspatialdistribution.Uptosixspeciescouldbefoundinone289

singlelocalityduringasinglevisit(Table1,mean=2.7speciesperlocality,SD=1.5),290

confirmingthesympatricnatureofthespeciesandtheexistingopportunitiesfor291

hybridization.292

3.2SequencingandRADlociassemblyresults293

Afterinitialscreening,21sampleswereremovedfromthefinaldatasetbecauseof294

insufficientcoverageortechnicalerrors.Wesuccessfullyanalyzed260specimensfor295

themitochondrialdataset(255forCOI,204forCOII,141forthefirst,and152forthe296

secondfragmentoftheD-loop),and263fortheRADdataset,while251sampleswere297

14

sharedbetweenthedatasets(Table1).FortheRADdataset,wealsosequenced22298

technicalreplicates(sampleswith“REPL”suffixinTableS1),and11DNAextraction299

replicatesfromtheflymusclevs.extractionsfromlegs(sampleswith“MUS”suffixin300

TableS1).301

Sequencingofthemitochondrialregionsyielded1132nucleotidepositionsforthe302

COI+COIIdataset(ofwhich120werevariable)and2003fortheCOI+COII+D-loop303

dataset(ofwhich334werevariable),afteralignmentandgapfiltering.Threerunsof304

RADsequencingoutput552’425’482of2x100bpreads,fromwhich340’598’636305

(62%)passedtherestrictionsiteandbarcodequalityfilters(TableS1).306

Aftercomparingthenumberofloci,coverage,andoverlapoflociamongreplicatesin307

theobtaineddatasets(Fig.S2),wechosethedatasetwithaminimumof75%sequence308

similarityrequiredforthesequencestoclusterinalocusandaminimumof20309

individualsperlocusforthemainanalyses.Thisdatasetcontained1724lociafter310

filteringandparalogremoval,with82’782variablesites.Theproportionofmissingdata311

inthedatasetwas0.84,withastrongphylogeneticcomponentinthedistributionof312

missingloci(Fig.S3a).AftersamplingoneSNPperlocusfortheSTRUCTUREanalysis,313

weobtained1669SNPs,ofwhich159werebi-allelic.314

ForthedatasetusedforassessinglociincongruenceintheRAD-seqbasedphylogeny315

(seebelow),wefocusedonlocipresentinatleast100individuals.Thisresultedina316

matrixof176loci(among1724overallnumberoflociidentified;i.e.,10.2%)with317

missingdatashowingmuchlessphylogeneticstructuring(Fig.S3b).318

15

3.3Mitochondrialandnuclear-dataphylogenies319

ThemitochondrialphylogenyontheCOI+COII_D-loopdataset(Fig.1aandFig.S4a)320

failedtoresolvefourofthecladesidentifiedbasedontheRAD-seqdata(seebelow),but321

retrievedwell-supportedmonophyleticgroupforC.rotundiventrisand,tothelesser322

extentforC.dentiferaandC.inermella,asbothofthelatterhadtwospecimensplaced323

outsidetheirclades.C.inermella,C.setifera,andC.trolliiformedonecladewiththe324

speciesextensivelyinterdispersedandacladecontainingmostlyC.macropyganested325

within.AstheanalysisbasedbasedonthereducedCOI+COIIdatasetrecoveredasimilar326

pattern,exceptplacingC.lophotaassistertoC.macropyga,werefertotheresultsofthe327

largerdatasetintherestofthepaper.MostoftheC.lophotasamplesalsoformedone328

cladewithlowersupportvalues.Therelationshipsamongsamplesfromtheremaining329

fourmorphospeciesremainedunresolved,withoutclearsupportforthe330

morphologicallydescribedspecies.331

IncontrasttotheMLmtDNAphylogeny,bothMLandSVDquartetsanalysisoftheRAD332

analysis(Fig.1b,candFig.S4b,c)confirmedmonophylyofthesevenmorphologically333

definedtaxa.RAxMLanalysisrevealedrelativelyhighbootstrapsupports(>90%)for334

alloftheinterspecificrelationships,exceptthesplitbetweenC.setiferaandtheclade(C.335

lophota,(C.macropyga,(C.dentifera,C.trollii)))withbootstrapsupport>80%.Thesplit336

ofC.rotundiventrisintotwoputativevicariantclades,informallyproposedbyPellmyr337

(1992)—northernC.abruptriventrisandsouthernC.rotundiventris,wasnotrecovered.338

SVDquartetsanalysisalsoconfirmedmonophylyofthespecies,butonlythesplit339

betweenC.dentiferaandC.trolliihadabootstrapsupport>90%;theclade(C.340

macropyga,(C.dentifera,C.trollii))hadbootstrapsupport>80%;thesetwocladeswere341

theonlyonessupportedbybothSVDquartetsandtheRAxMLanalyses(Fig.1c).342

16

Moreover,SVDquartetsrevealedtwowell-supportedcladeswithinC.rotundiventris.343

Thesehoweverdonotshowanypatternofvicarianceandoftenoccurtogetherina344

singlepopulation,thusmostlikelydonotcorrespondtothetwovicariantspeciesofC.345

abruptiventrisandC.rotundiventrisasdiscussedbyPellmyr(1992).Thetechnical346

replicateswereconsistentintheplacementofthesamplewithintheproperclade,and347

mostreplicateswereplacedassistercladeswithbothmethods(Fig.S4b,c).348

3.4IncongruenceamongtheRAD-sequencingloci349

TreeClanalysisidentified,inthemostconservativeinterpretation,atleastfourclusters350

ofloci,asthelargestlikelihoodimprovementwasobtainedwhenincreasingthenumber351

ofbinsfromthreetofour(Fig.S6).Thebinsizeswereof29,42,47,and58loci,352

thereforetheidentifiedgroupswerenotsimplyconsistingofafewoutliers.Thetrees353

inferredforthefourbinsconfirmedthemonophylyoftheanalysedspeciestoalarge354

extent,althoughfewindividualsappearedoutsidetheirexpectedclades.Thelargest355

departurefrommonophylywasobservedforC.lophotainthesmallesttreeconsistingof356

29loci(Fig.2).Thetreesinferredforeachclusterhadbranchsupportsforinterspecific357

nodeslargerthan95%,anddifferedsubstantiallyintermsoftopologyandbranch358

lengths.Onlyonetree,withthelargestnumberofloci(i.e.,58)supportedtheonlyclade359

thatwassupportedbybothRAMLandSVDquartetsanalysis(C.macropyga,(C.360

dentifera,C.trollii)).Exceptthat,theinterspecificrelationshipsretrievedwitheachof361

thetreeClbinsweredifferentthanwiththeconcatenatedRAxMLanalysisand362

SVDquartetsanalysis(Fig.1b,c).363

17

3.5Structureanalysis364

Wefoundlowlevelsofcontemporaryintrogression,asshownbySTRUCTUREanalysis.365

ThemostlikelyKnumberofSTRUCTUREgroupswasconsistentwiththenumberof366

morphologicalspecies(7),andallsampleswereassignedtotheir‘correct’367

morphospecies(Fig.1d).Alsoforlowernumbersofclusters,wedidnotobserve368

signaturesofintrogression(Fig.S5). 369

4.Discussion370

4.1UtilityandlimitsofRAD-sequencingforresolvingphylogenyof371

a„difficult”genus372

RAD-sequencingsuccessfullydiscriminatedallformallydescribedEuropean373

Chiastochetaspecies.Therobustspeciesdelineationisstrikingwhencomparedto374

mtDNA-basedtreesthatfailedtosupportmonophylyofC.inermella,C.macropyga,C.375

setifera,andC.trollii(Fig.1aandFig.S4a;seealso:Desprésetal.2002;Espíndolaetal.376

2012).Theabilitytorecoverpreviouslydefinedmorphologicalspeciesinourdataset,377

whateveranalysismethodused(i.e.,maximum-likelihoodphylogeneticreconstruction378

usingaconcatenatedmatrixwithRAxML,coalescence-basedphylogeneticinference379

withSVDquartets,orpopulation-geneticsclusteringwithSTRUCTURE),supportsthe380

resultsofaprevioussimulationstudybyHovmölleretal.(2013),thathighamountsof381

missingdata,typicalforRAD-baseddatasets,shouldnotinterferewithclade(orcluster)382

identification.Recently,similarconclusionsweredrawnbyEatonetal.(2016)383

concerningtheSVDquartetsmethod.384

18

Incontrast,noconsensuscouldbereachedinretrievinginter-specificrelationships.385

WhereasRAxMLidentifiedrelationshipswithhighbootstrapsupportinfourofthefive386

possibleinterspecificrelationships,onlytwoofthemwerealsosupportedbythe387

SVDquartetsanalysis(Fig.1b,c).Incongruenceinthephylogeneticsignalsassociated388

withdifferentsetsoflocicouldexplainthedifficultyinresolvingtheseinterspecific389

relationships.WhenperforminglocibinningusingtreeCL(Gorietal.2016),wefound390

outthatdifferentsubsetsofloci(inourcase,theoptimalnumberofbinswasequalto391

four)produceddifferenttopologies,whilestillbeinglargelycongruentinthesample392

assignmentintospecies(Fig.2).393

Shortinterspecificbranchesintheresolvedphylogeniesconfirmtheconclusionsof394

Espíndolaetal.(2012)thatmostofthespeciesfromtheChiastochetagenusunderwent395

arecent(lessthan1.6Mya),rapidradiation.Theseresultshighlightthefactthatinsuch396

casesitmaybeimpossibletoretrievesomeofthephylogeneticrelationshipsamongthe397

taxaasfullybifurcatingtree,becausegenetreesmaydepictdifferentevolutionary398

historiesduetoincompletelineagesorting(Aviseetal.2008;Maddison1997).Thisisa399

limitationsharedwithclassicalmarkers(Walshetal.1999)andotherNGSapproaches400

(seebelow),pointingtoapossibleconstitutivelimitationinresolvingrapidradiations.401

Inrapidlydivergingtaxa,eventhelargenumberofnuclearmarkers,whilebeingmore402

successfulhereinrecoveringspeciesboundariesthanmitochondrialmarkersmaynot403

beinformative-enoughtoretrieveallinterspecificevolutionaryrelationships.404

TheextenttowhichtheabovelimitationistheresultoftechnicalconstraintsofRAD405

datasetsoratruebiologicallimitationremainstobeinvestigated.RAD-seqtargets406

random,mostlyneutralpartsofthegenome.Thisresultsinhighnumberoflineage-407

specificmutationsthatbearastrongsignaltodelineatespeciesorpopulations–within408

19

thesefast-evolvingpartsofthegenome,evenvaryingallelecopy-numbers(i.e.recent409

paralogs)canappearaspopulation-specific(Mastretta-Yanesetal.2014).The410

downsideishowever,missingdataincreasesrapidlywithevolutionarydistanceasa411

resultofthelossofrestrictionsites(Cariouetal.2013; Chattopadhyayetal.2014;412

DaCosta&Sorenson2016;Gautieretal.2013;Rubinetal.2012;Wagneretal.2013).413

Forinstance,Leachéetal.(2015)founddifferencesbetweenphylogeniesobtained414

usingRAD-seqvs.targetenrichmenttechniques,whereasotherstudieshaveshownthe415

agreementamongdatatypes(Mantheyetal.2016).Thelattertechniquesrelyon416

captureofapredefined(Fairclothetal.2014;McCormacketal.2012)orrandom417

(Suchanetal.2016,Schmidetal.2017)subsetofloci.Bynotrelyingonthepresenceof418

restrictionsites,andthushavinglessmissingdata,enrichmenttechniquesmaybe419

bettersuitedforbroaderphylogeneticscales.420

Nevertheless,ithasbeenshownthatevenwithhundredsofconservedloci,known421

substitutionmodelsandseveralindividualsperspecies,treeswithshortbranchesare422

difficulttoresolve,andMLanalysesbasedonconcatenatedsequencesmayprovidehigh423

bootstrapvaluesdespiteincorrectlyresolvedtopologies(Giarla&Esselstyn2015;424

Kubatko&Degnan2007;butseeGatesy&Springer2013;Springer&Gatesy2016;Roch425

&Warnow2015).Thisisexemplifiedbyourstudy,inwhichusingallRADloci,we426

obtainedaMLphylogenywithhighlysupportedinterspecificnodes,whereas427

coalescence-basedphylogeneticinferencedidnotshowstrongsupportsformostofthe428

interspecificrelationships.Ourexplorationofexplanationsforsuchadiscrepancyusing429

thelocibinningapproachshowedsupportforatleastfourdifferentunderlyinggene430

treetopologies.Intheseanalyses,wereducedthedatasettoanon-randomsetofloci431

whenfilteringforhighlocicoverageamongsamples.Theretainedloci,presentinat432

20

least100analyzedindividuals,andwithlessphylogenetically-structuredmissingdata433

(seeFig.S3b),shouldbecharacterizedbylowermutationratesorbeingunder434

stabilizingselection(Huang&Knowles2014).Usingbinning,thebestfittothedatawas435

notobtainedwithasinglebinoflocibutwithfour.Wecouldthereforenotidentifyone436

singleevolutionaryhistoryoftheChiastochetagenus,butratherequally-supported437

genetreestopologies.Importantly,thesedifferenttopologiescannotbeattributedtoa438

fewoutlierloci,astheirdistributionwasrelativelyevenacrosstheclusters(29,58,42439

and47loci;Fig.S6),incongruenceamongthesesetspossiblyimpactingmaximum-440

likelihoodphylogeneticreconstructionusingaconcatenatedmatrixandcoalescence-441

basedphylogeneticinference.Wehavealsoconfirmedthatinsuchcases,MLmethods442

provideelevatedbootstrapsupportvalues,andthatlowerbootstrapsupportvalues443

resultingfromcoalescence-basedmethodsmaybetterreflectthebiologicaluncertainty444

ofinterspecificrelationships.445

4.2MitonucleardiscordanceinthephylogenyofChiastocheta446

WhileourRAD-sequencedatasetdelineatedsevenclades,withfullagreementwiththe447

morphologicalassignments,mitochondrialdatafailedtosupportspeciesmonophyly,448

exceptforC.rotundiventrisand,toalesserextent,C.lophotaandC.dentifera.Theother449

remainingspecies:C.inermella,C.setiferaandC.trolliiformedalargecladewiththe450

speciesextensivelyinterdispersedandwiththecladeconsistingmostlyofC.macropyga451

nestedwithin(Fig.1a).Despite,onaverage,mitochondrialmarkersshouldbemore452

suitedforcapturingrelationshipsamongrecentlydivergedlineages,duetoaneffective453

populationsizefourtimeslessthanthatofnucleargenes(assumingneutralprocesses,454

equalsexratios,andunbiasedmatingsystems),andthusshortercoalescencetimes455

21

(Zink&Barrowclough2008),analyzingalargedatasetofnuclearmarkersprovided456

morepowertodiscriminatethespeciesinourcase.457

Mitonucleardiscordancepatternscanbeexplainedeitherbythedifferentbiological458

propertiesofmitochondrialDNA(vegetativesegregation,uniparentalinheritance,459

intracellularselection,andreducedrecombination;Birky2001)ordifferencesinthe460

evolutionaryhistoriesofnuclearandmitochondrialmarkers[e.g.,directselectionon461

themitochondrialgenes(Ballardetal.2007;Ballard&Pichaud2014;Boratyńskietal.462

2014;Dowlingetal.2008),incompletelineagesorting,historicalorongoinggeneflow463

amongspecies,orhybridspeciation].Indeed,ithasbeenshownbeforethatrelyingona464

single,non-recombiningmtDNAmoleculemayprovideamisleadingaccountofa465

specieshistory(e.g.,Ballard&Whitlock2004;Govindarajuluetal.2015;Phillipsetal.466

2013;Seehausenetal.2003;andreviewsbyFunk&Omland2003;Rubinoff&Holland467

2005).Whileinvestigatingthereasonsforthemito-nucleardiscordancewasnotwithin468

thescopeofthispaper,wecouldrejectthehypothesisofacontemporarygeneflowor469

hybridoriginofthetaxaasresponsibleforthispattern.Wedidnotdetectsignatureofa470

geneticmosaicinthe—mostly—nuclearRADdata,whichwouldbeexpectedinthecase471

ofhybridorigin(Ballard2000;Brelsfordetal.2011;Mallet2007;Pollardetal.2006).472

UsingRAD-sequencingdata,theassignmentofsamplesintospecieswasconcordant473

withmorphology(Fig.1b,candS4b,c)andwedidnotdetectsignificantlevelsof474

contemporarygeneflowusingpopulationgenetics-basedapproaches(Fig.1d),despite475

apparentopportunitiesforhybridization.MostofChiastochetaoccurinsympatry(Fig.476

S1),theyalsohaveverysimilarbiologies,reproducingandspendingmostoftheirtime477

onorinsideflowersofTrolliuseuropaeus(Pellmyr1989;Suchanetal.2015).Althougha478

temporalsequenceinovipositionhasbeenobserved(Després&Jaeger1999;479

22

Johannesen&Loeschcke1996;Pellmyr1989),mostspeciesco-occurtemporally.480

Despitetheseecologicalsimilaritiesandtherelativelyyoungageofthegenus(mostof481

thecladesemerginglessthan1.6Ma;Espindolaetal.2012),alackofnuclearevidence482

forhybridizationindicatesstrongcontemporaryreproductivebarriersamongthe483

species.484

5.Conclusions485

ThisstudydemonstrateshowacombinationofRAD-seqandmtDNAdatacanprovide486

insightsintophylogeniesofgenerathatarepoorlyresolvedusingmitochondrial487

markersaloneandrevealcomplexpictureofmitonucleardiscordance.Italso488

underlinesthelimitsofRAD-seq-basedphylogeniesincaseofrapidradiations.Our489

resultsshowthatascenarioofrapidradiationcanaffectmanylociacrossthegenome,490

leadingtodiscordantgenetrees,evenwhenusingmethodscontrollingforincomplete491

lineagesorting.Thismaypointtoaninherentlimitationofusingmolecularmarkersto492

resolverapidradiations,atleastatsomeoftheinter-specificrelationships,andsuggests493

thatthislimitationisnotnecessarilyduetotechnicalissues(e.g.lownumberofshared494

markers).495

Addingtothebodyofexamplesofmito-nucleardiscordance(reviewedinToews&496

Brelsford2012),ourstudywarnsagainstrelyingsolelyonmitochondrialmarkers(e.g.,497

COIbarcoding;Herbertetal.2003)forspeciesdelimitation,especiallywhentheyshow498

incongruencewithclassicaltaxonomy.Inthecasepresentedhere,mitochondrial499

markerssuggestedpoly-orparaphylyformostspecies,andproposedtheneedto500

reviewthetaxonomyofthegenus(Espíndolaetal.2012).Whentackledfromthe501

genomicpointofview,thegeneticsupportofspeciesstatusforthesesevenentitieswas502

23

confirmed.Finally,weprovideanexampleofhowMLphylogeniesbasedonlarge503

concatenateddatasetscanprovideerroneouslyhighbootstrapsupportsforincorrector504

uncertaintopologies(Giarla&Esselstyn2015;Kubatko&Degnan2007).505

Acknowledgments506

TheauthorswouldliketothankY.Triponez,N.Magrou,P.Lazarevic,D.Gyurova,R.507

Lavigne,L.Juillerat,N.Villard,andR.ArnouxfortheirhelpduringthefieldworkandV.508

Michelsenforhisgreathelpwithspeciesdetermination.WealsothankJ.OllertonandJ.509

Pannellforinsightfuldiscussionsandcommentsonthemanuscript.Thisworkwas510

supportedbythegrant‘FondsdesDonations’oftheUniversityofNeuchâtel511

(Switzerland)attributedtoAE.TSacknowledgesfundingfromtheRectors’Conference512

oftheSwissUniversities(CRUS)throughthe“ScientificExchangeProgrambetween513

SwitzerlandandthenewEUmemberstates”(SciexNMSgrantno.10.116)andfrom514

SociétéAcadémiqueVaudoise(Switzerland).ThisworkwasfundedbytheSwiss515

NationalScienceFoundation(grants3100A0-116778,PZ00P3-126624,516

PP00P3_144870toN.Alvarez,andgrantPP00P3_150654toCD)andbyagrantfrom517

SwitzerlandthroughtheSwissContributiontotheenlargedEuropeanUnion(Polish-518

SwissResearchProgram,projectno.PSPB-161/2010). 519

24

References520

AndrewsKR,GoodJM,MillerMR,LuikartG,HohenlohePA(2016)Harnessingthe521

powerofRADseqforecologicalandevolutionarygenomics.NatureReviews522

Genetics,17,81-92.523

AnisimovaM,GilM,DufayardJ-F,DessimozC,GascuelO(2011)SurveyofBranch524

SupportMethodsDemonstratesAccuracy,Power,andRobustnessofFast525

Likelihood-basedApproximationSchemes.SystematicBiology,60,685–699.526

AviseJC,RobinsonTJ,KubatkoL(2008)Hemiplasy:anewterminthelexiconof527

phylogenetics.SystematicBiology,57(3),503-507.528

BairdNA,EtterPD,AtwoodTS,CurreyMC,ShiverAL,LewisZA,SelkerEU,CreskoWA,529

JohnsonEA(2008)RapidSNPdiscoveryandgeneticmappingusingsequencedRAD530

markers.PLoSONE,3,e3376.531

BallardJWO(2000)Whenoneisnotenough:introgressionofmitochondrialDNAin532

Drosophila.MolecularBiologyandEvolution,17,1126-1130.533

BallardJWO,MelvinRG,KatewaSD,MaasK(2007)MitochondrialDNAvariationis534

associatedwithmeasurabledifferencesinlife-historytraitsandmitochondrial535

metabolisminDrosophilasimulans.Evolution,61,1735-1747.536

BallardJWO,PichaudN(2014)MitochondrialDNA:morethananevolutionary537

bystander.FunctionalEcology,28,218-231.538

BallardJWO,WhitlockMC(2004)Theincompletenaturalhistoryofmitochondria.539

MolecularEcology,13,729-744.540

25

BirkyCW(2001)Theinheritanceofgenesinmitochondriaandchloroplasts:laws,541

mechanisms,andmodels.AnnualReviewofGenetics,35,125-48.542

BoratyńskiZ,Melo-FerreiraJ,AlvesPC,BertoS,KoskelaE,PentikäinenOT,MappesT543

(2014)Molecularandecologicalsignsofmitochondrialadaptation:consequences544

forintrogression.Heredity,113,277-286.545

BrelsfordA,MiláB,IrwinDE(2011)HybridoriginofAudubon’swarbler.Molecular546

Ecology,20,2380-2389.547

CariouM,DuretL,CharlatS(2013)IsRAD-seqsuitableforphylogeneticinference?An548

insilicoassessmentandoptimization.EcologyandEvolution,3,846-852.549

ChattopadhyayB,GargKM,RamakrishnanU(2014)Effectofdiversityandmissingdata550

ongeneticassignmentwithRAD-Seqmarkers.BMCresearchnotes,7,841.551

ChifmanJ,KubatkoL(2014)QuartetinferencefromSNPdataunderthecoalescent552

model.Bioinformatics,30:3317-3324.553

CollinJE(1954)ThegenusChiastochetaPokorny(Diptera:Anthomyiidae).Proceedings554

oftheRoyalEntomologicalSocietyLondon(B),23,95-102.555

ComboschDJ,VollmerSV(2015)Trans-PacificRAD-Seqpopulationgenomicsconfirms556

introgressivehybridizationinEasternPacificPocilloporacorals.Molecular557

PhylogeneticsandEvolution,88,154-162.558

CruaudA,GautierM,GalanM,FoucaudJ,SaunéL,GensonG,RasplusJY(2014)559

EmpiricalassessmentofRADsequencingforinterspecificphylogeny.Molecular560

BiologyandEvolution,31,1272-1274.561

26

DaCostaJM,SorensonMD(2016)ddRAD-seqphylogeneticsbasedonnucleotide,indel,562

andpresence–absencepolymorphisms:Analysesoftwoaviangenerawith563

contrastinghistories.MolecularPhylogeneticsandEvolution,94,122-135.564

DaveyJL,BlaxterMW(2010)RADSeq:Next-generationpopulationgenetics.Briefingsin565

FunctionalGenomics,9,416-423.566

DaveyJW,HohenlohePA,EtterPD,BooneJQ,CatchenJM,BlaxterML(2011)Genome-567

widegeneticmarkerdiscoveryandgenotypingusingnext-generationsequencing.568

NatureReviewsGenetics,12,499-510.569

DeFilippisVR,MooreWS(2000)Resolutionofphylogeneticrelationshipsamong570

recentlyevolvedspeciesasafunctionofamountofDNAsequence:Anempirical571

studybasedonwoodpeckers(Aves:Picidae).MolecularPhylogeneticsand572

Evolution,16,143-160.573

Degnan,J.H.,Rosenberg,N.A.,2009.Genetreediscordance,phylogeneticinferenceand574

themultispeciescoalescent.TrendsinEcologyandEvolution24,332–340.575

DesprésL,JaegerN(1999)Evolutionofovipositionstrategiesandspeciationinthe576

globeflowerfliesChiastochetaspp.(Anthomyiidae).JournalofEvolutionaryBiology,577

12,822-831.578

DesprésL,PettexE,PlaisanceV,PompanonF(2002)Speciationintheglobeflowerfly579

Chiastochetaspp.(Diptera:Anthomyiidae)inrelationtohostplantspecies,580

biogeography,andmorphology.MolecularPhylogeneticsandEvolution,22,258-268.581

DowlingDK,FribergU,LindellJ(2008)Evolutionaryimplicationsofnon-neutral582

mitochondrialgeneticvariation.TrendsinEcologyandEvolution,23,546-554.583

27

EarlDA,vonHoldtBM(2012)STRUCTUREHARVESTER:awebsiteandprogramfor584

visualizingSTRUCTUREoutputandimplementingtheEvannomethod.Conservation585

GeneticsResources,4,359-361.586

EatonDAR(2014)PyRAD:assemblyofdenovoRADseqlociforphylogeneticanalyses.587

Bioinformatics,30,1844-1849.588

EatonDAR,ReeRH(2013)Inferringphylogenyandintrogressionusinggenomic589

RADseqdata:Anexamplefromfloweringplants(Pedicularis:Orobanchaceae).590

SystematicBiology,62,689-706.591

Eaton,D.A.,Spriggs,E.L.,Park,B.,&Donoghue,M.J.(2016).MisconceptionsonMissing592

DatainRAD-seqPhylogeneticswithaDeep-scaleExamplefromFloweringPlants.593

SystematicBiology,syw092.594

EdgarRC(2004)MUSCLE:multiplesequencealignmentwithhighaccuracyandhigh595

throughput.NucleicAcidsResearch32,1792-1797.596

EscuderoM,EatonDA,HahnM,HippAL(2014)Genotyping-by-sequencingasatoolto597

inferphylogenyandancestralhybridization:AcasestudyinCarex(Cyperaceae).598

MolecularPhylogeneticsandEvolution,79,359-367.599

EspíndolaA,BuerkiS,AlvarezN(2012)Ecologicalandhistoricaldriversof600

diversificationintheflygenusChiastochetaPokorny.MolecularPhylogeneticsand601

Evolution,63,466-474.602

EvannoG,RegnautS,GoudetJ(2005)Detectingthenumberofclustersofindividuals603

usingthesoftwareSTRUCTURE:asimulationstudy.MolecularEcology,14,2611-604

2620.605

28

FairclothBC,BranstetterMG,WhiteND,BradySG(2014)Targetenrichmentof606

ultraconservedelementsfromarthropodsprovidesagenomicperspectiveon607

relationshipsamongHymenoptera.MolecularEcologyResources,15,489-501608

FunkDJ,OmlandKE(2003)Species-levelparaphylyandpolyphyly:frequency,causes,609

andconsequences,withinsightsfromanimalmitochondrialDNA.AnnualReviewof610

Ecology,EvolutionandSystematics,34,397-423.611

Gatesy,J,SpringerMS(2013)Concatenationversuscoalescenceversus612

“concatalescence”.ProceedingsoftheNationalAcademyofSciences,110,E1179-613

E1179.614

GautierM,GharbiK,CezardT,FoucaudJ,KerdelhuéC,PudloP,CornuetJ-M,EstoupA615

(2013)TheeffectofRADalleledropoutontheestimationofgeneticvariation616

withinandbetweenpopulations.MolecularEcology,22,3165-3178.617

GiarlaTC,EsselstynJA(2015)Thechallengesofresolvingarapid,recentradiation:618

Empiricalandsimulatedphylogenomicsofphilippineshrews.SystematicBiology,619

64,727-740.620

GoriK,SuchanT,AlvarezN,GoldmanN,DessimozC(2016)Clusteringgenesofcommon621

evolutionaryhistory.MolecularBiologyandEvolution,33,1590–1605622

GovindarajuluR,ParksM,TennessenJA,ListonA,AshmanTL(2015)Comparisonof623

nuclear,plastid,andmitochondrialphylogeniesandtheoriginofwildoctoploid624

strawberryspecies.AmericanJournalofBotany,102,544-554.625

29

HarveyMG,JudyCD,SeeholzerGF,MaleyJM,GravesGR,BrumfieldRT(2015)Similarity626

thresholdsusedinDNAsequenceassemblyfromshortreadscanreducethe627

comparabilityofpopulationhistoriesacrossspecies.PeerJ,3,e895.628

HarveyMG,SmithBT,GlennTC,FairclothBC,BrumfieldRT(2016)Sequencecapture629

versusrestrictionsiteassociatedDNAsequencingforshallowsystematics.630

SystematicBiology,65,910-924.631

HebertPDN,CywinskaA,BallSL,deWaardJR(2003)Biologicalidentificationsthrough632

DNAbarcodes.ProceedingsoftheRoyalSocietyB:BiologicalSciences,270,313-322.633

HennigW(Ed.)(1976)Anthomyiidae.DieFliegenderPalaearktischenRegion.Stuttgart,634

E.Schweizerbart.635

HerreraS,ShankTM(2016)RADsequencingenablesunprecedentedphylogenetic636

resolutionandobjectivespeciesdelimitationinrecalcitrantdivergenttaxa.637

MolecularPhylogeneticsandEvolution,100,70-79.638

HippAL,EatonDAR,Cavender-BaresJ,FitzekE,NipperR,ManosPS(2014)A639

frameworkphylogenyoftheamericanoakcladebasedonsequencedRADdata.640

PLoSONE,9,e93975.641

HovmöllerR,KnowlesLL,KubatkoLS(2013)Effectsofmissingdataonspeciestree642

estimationunderthecoalescent.MolecularPhylogeneticsandEvolution,69,1057-643

1062.644

HuangH,KnowlesLL(2014)Unforeseenconsequencesofexcludingmissingdatafrom645

next-generationsequences:simulationstudyofRADsequences.SystematicBiology,646

doi:10.1093/sysbio/syu046.647

30

IlutDC,NydamML,HareMP(2014)Defininglociinrestriction-basedreduced648

representationgenomicdatafromnonmodelspecies:Sourcesofbiasand649

diagnosticsforoptimalclustering.BioMedResearchInternational,2014,675158.650

JohannesenJ,LoeschckeV(1996)Distribution,abundanceandovipositionpatternsof651

fourcoexistingChiastochetaspecies(Diptera:Anthomyiidae).JournalofAnimal652

Ecology,65,567-576.653

JonesJC,FanS,FranchiniP,SchartlM,MeyerA(2013)Theevolutionaryhistoryof654

Xiphophorusfishandtheirsexuallyselectedsword:agenome-wideapproachusing655

restrictionsite-associatedDNAsequencing.MolecularEcology,22,2986-3001.656

KubatkoLS,DegnanJH(2007)Inconsistencyofphylogeneticestimatesfrom657

concatenateddataundercoalescence.SystematicBiology,56,17-24.658

LanfearR,FrandsenPB,WrightAM,SenfeldT,CalcottB(2017)PartitionFinder2:new659

methodsforselectingpartitionedmodelsofevolutionformolecularand660

morphologicalphylogeneticanalyses.MolecularBiologyandEvolution.34,772-773.661

LeachéAD,ChavezAS,JonesLN,GrummerJA,GottschoAD,LinkemCW(2015)662

Phylogenomicsofphrynosomatidlizards:conflictingsignalsfromsequencecapture663

versusrestrictionsiteassociatedDNAsequencing.GenomeBiologyandEvolution,7,664

706-719.665

LeachéAD,HarrisRB,RannalaB,YangZ(2013)Theinfluenceofgeneflowonspecies666

treeestimation:asimulationstudy.SystematicBiology,63,17-30.667

LowryDB,HobanS,KelleyJL,LotterhosKE,ReedLK,AntolinMF,StorferA(2017)668

BreakingRAD:anevaluationoftheutilityofrestrictionsite-associatedDNA669

31

sequencingforgenomescansofadaptation.MolecularEcologyResources,17,142–670

152.671

MaddisonWP(1997)Genetreesinspeciestrees.SystematicBiology,46,523-536.672

MaddisonWP,KnowlesLL(2006)Inferringphylogenydespiteincompletelineage673

sorting.SystematicBiology,55,21-30.674

MalletJ(2007)Hybridspeciation.Nature,446,279-283.675

MantheyJD,CampilloLC,BurnsKJ,MoyleRG(2016)Comparisonoftarget-captureand676

restriction-siteassociatedDNAsequencingforphylogenomics:atestincardinalid677

tanagers(Aves,Genus:Piranga).Systematicbiology,syw005.678

Mastretta-YanesA,ZamudioS,JorgensenTH,ArrigoN,AlvarezN,PiñeroD,EmersonBC679

(2014)Geneduplication,populationgenomics,andspecies-leveldifferentiation680

withinatropicalmountainshrub.Genomebiologyandevolution,6,2611-2624.681

Mastretta-YanesA,ArrigoN,AlvarezN,JorgensenTH,PiñeroD,EmersonBC(2015)682

Restrictionsite-associatedDNAsequencing,genotypingerrorestimationandde683

novoassemblyoptimizationforpopulationgeneticinference.MolecularEcology684

Resources,15,28-41.685

McCormackJE,FairclothBC,CrawfordNG,GowatyPA,BrumfieldRT,GlennTC(2012)686

Ultraconservedelementsarenovelphylogenomicmarkersthatresolveplacental687

mammalphylogenywhencombinedwithspeciestreeanalysis.GenomeResearch,688

22,746-754.689

32

McKinneyGJ,LarsonWA,SeebLW,SeebJE(2017)RADseqprovidesunprecedented690

insightsintomolecularecologyandevolutionarygenetics:commentonBreaking691

RADbyLowryetal.(2016).MolecularEcologyResources,doi:10.1111/1755-692

0998.12649.693

MichelsenV(1985)ArevisionoftheAnthomyiidae(Diptera)describedbyJ.W.694

Zetterstedt.Steenstrupia,11,37-65695

MillerMA,PfeifferW,SchwartzT(2010)CreatingtheCIPRESScienceGatewayfor696

inferenceoflargephylogenetictrees.In:ProceedingsoftheGatewayComputing697

EnvironmentsWorkshop(GCE),14Nov.2010,NewOrleans,LApp.1-8.698

NadeauNJ,MartinSH,KozakKM,SalazarC,DasmahapatraK,DaveyJW,BaxterSW,699

BlaxterML,MalletJ,JigginsCD(2013)Genome-widepatternsofdivergenceand700

geneflowacrossabutterflyradiation.MolecularEcology,22,814–826.701

PanteE,AbdelkrimJ,ViricelA,GeyD,FranceSC,BoisselierMC,SamadiS(2015)Useof702

RADsequencingfordelimitingspecies.Heredity,114,450-459.703

ParadisE,ClaudeJ,StrimmerK(2004)APE:analysesofphylogeneticsandevolutionin704

Rlanguage.Bioinformatics,20,289-290.705

PellmyrO(1989)Thecostofmutualism–interactionsbetweenTrolliuseuropaeusand706

itspollinatingparasites.Oecologia,78,53-59.707

PellmyrO(1992)Thephylogenyofamutualism:evolutionandcoadaptationbetween708

Trolliusanditsseed-parasiticpollinators.BiologicalJournaloftheLinneanSociety,709

47,337-365.710

33

PetersonBK,WeberJN,KayEH,FisherHS,HoekstraHE(2012)DoubledigestRADseq:711

AninexpensivemethodfordenovoSNPdiscoveryandgenotypinginmodeland712

non-modelspecies.PLoSONE,7,e37135.713

PhillipsMJ,HaoucharD,PrattRC,GibbGC,BunceM(2013)InferringKangaroo714

PhylogenyfromIncongruentNuclearandMitochondrialGenes.PLoSONE,8,715

e57745.716

PollardDA,IyerVN,MosesAM,EisenMB(2006)Widespreaddiscordanceofgenetrees717

withspeciestreeinDrosophila:Evidenceforincompletelineagesorting.PLoS718

Genetics,2,e173.719

PritchardJK,StephensM,DonnellyP(2000)Inferenceofpopulationstructureusing720

multilocusgenotypedata.Genetics,155,945-959.721

ReazR,BayzidMS,RahmanMS(2014)Accuratephylogenetictreereconstructionfrom722

quartets:Aheuristicapproach.PloSOne,9:e104008.723

RochS,WarnowT(2015)Ontherobustnesstogenetreeestimationerror(orlack724

thereof)ofcoalescent-basedspeciestreemethods.SystematicBiology,syv016.725

RokasA,CarrollSB(2005)Moregenesormoretaxa?Therelativecontributionofgene726

numberandtaxonnumbertophylogeneticaccuracy.MolecularBiologyand727

Evolution,22,1337-1344.728

RubinBER,ReeRH,MoreauCS(2012)InferringphylogeniesfromRADsequencedata.729

PLoSONE,7,e33394.730

34

RubinoffD,HollandBS(2005)Betweentwoextremes:mitochondrialDNAisneitherthe731

panaceanorthenemesisofphylogeneticandtaxonomicinference.Systematic732

Biology,54,952-961.733

SchmidS,GenevestR,GobetE,SuchanT,SperisenC,TinnerW,AlvarezN(2017)734

HyRAD-X,aversatilemethodcombiningexomecaptureandRADsequencingto735

extractgenomicinformationfromancientDNA.MethodsinEcologyandEvolution,736

doi:10.1111/2041-210X.12785737

SeehausenO,KoetsierE,SchneiderMV,ChapmanLJ,ChapmanCA,KnightME,Turner738

GF,vanAlphenJJM,BillsR(2003)Nuclearmarkersrevealunexpectedgenetic739

variationandaCongolese-NiloticoriginoftheLakeVictoriacichlidspeciesflock.740

ProceedingsoftheRoyalSocietyofLondonSeriesB,270,129-137.741

SpringerMS,GatesyJ(2016)Thegenetreedelusion.MolecularPhylogeneticsand742

Evolution,94,1-33.743

StamatakisA(2014)RAxMLVersion8:Atoolforphylogeneticanalysisandpost-744

analysisoflargephylogenies.Bioinformatics,30,1312-1313.745

SuchanT,BeauverdM,TrimN,AlvarezN(2015)AsymmetricalnatureoftheTrollius-746

Chiastochetainteraction:insightsintotheevolutionofnurserypollinationsystems.747

EcologyandEvolution,5,4766-4777.748

SuchanT,PitteloudC,GerasimovaNS,KostikovaA,SchmidS,ArrigoN,PajkovicM,749

RonikierM,AlvarezN(2016)HybridizationCaptureUsingRADProbes(hyRAD),a750

NewToolforPerformingGenomicAnalysesonCollectionSpecimens.PLoSONE,11,751

e0151651.752

35

SuchanT,EspíndolaA,RutschmannS,EmersonBC,GoriK,DessimozC,ArrigoN,Ronikier753

M,AlvarezN(2017)Datafrom:AssessingthepotentialofRAD-sequencingtoresolvephy-754

logenetic relationshipswithinspeciesradiations: the flygenusChiastocheta (Diptera:An-755

thomyiidae)asacasestudy.MendeleyData,https://doi.org/XXX756

SwoffordDL(2002)PAUP*:Phylogeneticanalysisusingparsimony(*andother757

methods).Version4.Sinauer,Sunderland,Massachusetts,USA.758

ToewsDP,BrelsfordA(2012)Thebiogeographyofmitochondrialandnuclear759

discordanceinanimals.MolecularEcology,21,3907-3930.760

TownsendTM,MulcahyDG,NoonanBP,SitesJW,KuczynskiCA,WiensJJ,ReederTW761

(2011)Phylogenyofiguanianlizardsinferredfrom29nuclearloci,anda762

comparisonofconcatenatedandspecies-treeapproachesforanancient,rapid763

radiation.MolecularPhylogeneticsandEvolution,61,363-380.764

WagnerCE,KellerI,WittwerS,SelzOM,MwaikoS,GreuterL,SivasundarA,Seehausen765

O(2013)Genome-wideRADsequencedataprovideunprecedentedresolutionof766

speciesboundariesandrelationshipsintheLakeVictoriacichlidadaptiveradiation.767

MolecularEcology,22,787–798.768

WalshHE,KiddMG,MoumT,FriesenVL(1999)Polytomiesandthepowerof769

phylogeneticinference.Evolution,53,932-937.770

WhitfieldJB,KjerKM(2008)Ancientrapidradiationsofinsects:challengesfor771

phylogeneticanalysis.AnnualReviewofEntomology,53,449-472.772

WhitfieldJB,LockhartPJ(2007)Decipheringancientrapidradiations.Trendsin773

EcologyandEvolution,22,258-265.774

36

WielstraB,ArntzenJW,vanderGaagKJ,PabijanM,BabikW(2014)DataConcatenation,775

BayesianConcordanceandCoalescent-BasedAnalysesoftheSpeciesTreeforthe776

RapidRadiationofTriturusNewts.PLoSONE,9,e111011.777

WilliamsJS,NiedzwieckiJH,WeisrockDW(2013)Speciestreereconstructionofa778

poorlyresolvedcladeofsalamanders(Ambystomatidae)usingmultiplenuclear779

loci.MolecularPhylogeneticsandEvolution,68,671-682.780

ZetterstedtJW(1845)Dipterascandinaviaedispositaetdescripta.IV,1281-1738.Lund,781

Sweden.782

ZinkRM,BarrowcloughGF(2008)MitochondrialDNAundersiegeinavian783

phylogeography.MolecularEcology,17,2107-2121.784

DataAccessibility785

DNAsequencesareavailableonGenbankunderaccessionsnoXXX-XXX.Nexusand786

STRUCTUREdatasetsusedfortheanalyses,MLphylogenyinferredforthemainRAD787

dataset,SVDquartetanalyses,andMLphylogenyinferredforthemtDNAdatasetare788

availableonMendeleyDatahttps://doi.org/XXX(Suchanetal.2017).789

AuthorContributions790

TS,NA,AE,KGandCDdesignedresearch,TS,AE,KG,andSRperformedresearchand791

analyzeddata.Allauthorswrotethepaper. 792

37

Figures793

794

795

Figure1.Phylogeniesobtainedfora)MLanalysisofthemtDNAdataset;b)MLanalysis796

oftheRADdataset;c)SVDquartetsanalysisofRADdataset;bootstrapnodesupports>797

80areshowndenotedbygraypoints,bootstrapnodesupports>90areshowndenoted798

byblackpoints.d)PopulationclusteringofthesampledChiastochetaspecimens,799

estimatedwithSTRUCTUREusingK=7value. 800

●●●●

●●●●

●●●●●●

●●●●●●●●

●●

●●● ●●●●●●●●●●●●

●●●●●●

●●●●●●

●●

●●●●●●

●● ●●

●●●●●●●●

●●●●

●●●●●●●●

●●●

●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●

●●● ●●●●●●

●●●

●●●●●●

●●●●●●●●●●●●●●

●●●●●●●●●● ●●● ●●●●●

●● ●●●●●●●●●●●●●●●●●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●● ●●●●●●●●●●

●●

●●●●●●

●●●

● ●

●●

● ●

● ●

a)

●●●●●●●

C. dentiferaC. inermellaC. lophotaC. macropygaC. rotundiventrisC. setiferaC. trollii

●●●●●●●●

●●●

●●●●●

● ●● ●●● ●●

●● ●●●●

●●●●●

●●●●

●●●●●●●

●●●

●●

● ●●●●

●●●

●●●● ●

●●

● ●●●●●●

● ●●●●

●●●●●● ●● ●●

●● ●●●

●●

●●●●

● ●●●● ●● ●●●

●●●●●

●●●●

●●●●●●

● ●●●●

●●●●●

● ●●●●

●●●●●●●

●●●● ●●● ●●

●●●●●●●

●● ●●●

●●●●●

●●

● ●●

●●●●● ●●

●●● ●● ●

● ●●●● ●

●● ●

●●●●●

●● ●●●●●

● ●●●

●● ● ●●●●●

●●

●●● ●●●●●

● ●●● ●●●●●

●●●

●●●●●●●●●●

●●●● ●●●●●

●● ●●●●●●

● ●●

●●● ●●●●●●●●●●●

●●●●

●●●

●●●

● ●●

●●● ●

●●●

● ●

● ●●●●●●

●●

●●●● ●●●●

● ●

●●

●●

●●

●●

●●●

●● ●●

b)

●●●●●●●●● ●

●●●●●●●●

● ● ●●●●● ●●

●●

●●● ●●●● ●●●●●●

● ●●●●●●●●

● ●●●

●●●●●● ●●

●●●●●●

●●● ●●

●●●●●●●●

●●●●●●

● ●●●●

● ● ●●

●●● ●●●●●●●●●

●●●●●●●●●

●●●●

●● ● ●●

●●● ●●●●●

●●●●● ●●●

●●●●●

● ●●● ●●

●●●●●

●●●●

●●

●●●●●●●●●

●●●●●

●●

●● ●●●●●●

●●●● ●●●●●●●

●●●●●●●●

●●●●●●●●●

● ●●●

●●

●●●●●●

●●●

● ●●●●●●

●● ●●●●●●

●● ●●●●●● ●●●

● ●●●●

●●●●●

● ●●●●●

● ●●●

●●● ●●

●●●●●●

● ●●

●●●

●●●

● ●

● ●●

●●

● ●

●●●

●●

●●

●●

● ●

●●

● ●●

●● ●

●● ●

● ●

c)c) d)

38

801

802

Figure2.Phylogenetictreesonthefourbins,asidentifiedbythetreeClanalysis,803

consideringonlythelocipresentinatleast100specimens.Bootstrapnodesupports>804

80areshowndenotedbygraypoints,bootstrapnodesupports>90areshowndenoted805

byblackpoints. 806

●●

● ●●

58 loci

● ●

●● ●

●●●

●●●●

●●

●●●

● ●●●

● ● 29 loci

●●

●●

●●

●●●

● ●

● ●● ●

●●

●●

● 42 loci

●●

●●● ●● ●

● ●

●● ●●● ●

●●

●● ●

● ●●

27 loci

39

Tables807

Table1.Populationsincludedinthestudy,withgeographicalcoordinatesandthe808

numberofspecimensusedinthefinalanalyses.LettercodesdenoteChiastocheta809

species:C.dentifera(D),C.inermella(I),C.lophota(L),C.macropyga(M),C.810

rotundiventris(R),C.setifera(S),andC.trollii(T).811

code site latitude longitude year D I L M R S T sum

AMB Ambri 46.50680 8.70292 2008 2 4 2 1 9 AMO Amot 59.62199 8.42346 2007 4 4 BAY Bayasse 44.30814 6.74067 2007 1 2 2 2 3 10 BEI Beistohlen 61.20761 8.95473 2007 5 5 BID Bidjovagge 69.29778 22.47808 2008 1 1 2 BON Colde

Bonnecombe 44.57557 3.11410 2007

3 1 1 2 7 BRA Braas 57.09309 15.06817 2007 1 1 CCO ColdelaCo-

lombière 45.98722 6.46972 2006

2 1 2 1 6 CDV CreuxduVan 46.93526 6.74119 2006 1 2 2 2 7 CHA Chasseral 47.12569 7.02130 2006 5 3 1 9 CHE Chemin 46.08993 7.08978 2006

3 1 3 2 1 2 12 CRA Crans-Montana 46.34650 7.53890 2006 1 1 3 1 1 7 CRE CressbrookDale 53.26724 -1.74041 2008 2 2 CTP ColtPark 54.19365 -2.35247 2008 4 1 1 6 DON Donovaly 48.88922 19.23068 2008 3 1 2 6 EID EiddaPastures 53.03720 -3.74190 2008 2 2 ELL Ellingsrudelva 59.91771 10.91844 2007 1 1 EPO Esposouille 42.62341 2.09450 2008

1 1 2 2 3 9 FRO Froson 63.18205 14.60268 2007 2 2

40

GAL ColduGalibier 45.08528 6.43861 2006 2 2 3 3 10 GLE GlenFender 56.78138 -3.79485 2008 2 2 4 HT1 HauteTinee1 44.29617 6.81871 2007 3 3 HT2 HauteTinee2 44.28426 6.85581 2007 3 1 4 KRA KrasnoPolje 44.80869 14.97271 2008 1 1 LAK Laktatjakka 68.42931 18.40674 2007 1 4 2 7 LFE LoughFern 55.06569 -7.71130 2008 1 1 LOS Loser 47.66052 13.78485 2007 4 3 1 8 MOE Moerlimatt 47.90597 8.07760 2007 1 1 1 1 4 MTP MontePizi 41.91524 14.16714 2008 2 2 NAV Naverdal 62.70417 10.13002 2007 3 1 4 PAJ PajinoPreslo 43.27799 20.81970 2008 2 2 PAN Puertode

Panderrueda 43.12743 -4.97223 2008

1 1 3 2 1 8 PIL Pila 48.90017 20.29449 2008

3 2 1 6 POD Podlesok 48.94962 20.35190 2008 1 1 2 PPN PetitPapaNoel 66.51647 25.79386 2007

1 3 2 3 9 PYD PuydeDome 45.77222 2.96333 2006 2 2 2 6 PYM PuyMary 45.11139 2.68083 2006 1 1 PYS PuydeSancy 45.53500 2.80972 2006 3 2 1 6 RAD Radkow 50.46866 16.35321 2008

2 4 2 8 RIS Risnjak-Snjeznik 45.43871 14.58494 2008 1 2 3 SAL Salla 66.83020 28.65427 2007

1 2 1 4 8 SED SededePan 43.03949 -0.48651 2008 2 2 SET Seterasen 65.53432 13.67744 2007

1 4 1 1 1 8 SOL Solberga 57.95194 13.56116 2007

4 2 2 1 9 STE Steingaden 47.59529 11.01296 2007 1 1 31 5

41

STR Straumen 67.38440 15.64921 2007 2 2 SUS Susch 46.74728 10.07473 2006

2 2 2 1 3 10 SVA Svartla 65.99583 21.22062 2007

3 3 6 TAR Tarasp 46.77730 10.25056 2006 1 2 1 3 7 VIT Vitosha 42.59032 23.29342 2008 2 2 ZAL ZaliLog 46.20342 14.11080 2008 3 2 1 6

total: 21 44 42 33 59 34 38 271

812

42

AppendixA.Supplementarymaterial813

TableS1.SummarystatisticsfortheRAD-sequencedsamples:numberofRAD814

fragmentsclustersandmeancoverageafterretainingclusterswithacoverage>5,815

estimatedheterozygosities,numberofconsensuslociafterparalogfiltering,andthe816

numbersoflociretainedforeachdatasetafterfilteringforcoverageamongthesamples.817

818

Fig.S1.MapofthesampledChiastochetaspecimensusedinthestudy.819

820

Fig.S2.Theeffectofdifferentclusteringthresholds(X-axis)andminimumloci821

coverages(indicatedbycolors:red–10,blue–20,green–100individuals)onthetotal822

numberofassembledloci,proportionofmissingdata,locioverlapamongthetechnical823

replicates,andmeannumberofindividualsperlocus.824

825

Fig.S3.PatternofRAD-seqlocisharingamongthesequencedindividualsfordatasets:826

a)themaindatasetusingclusteringsimilarityof75%andminimumlocicoverage827

amongindividualsof20;b)usingclusteringsimilarityof75%andminimumloci828

coverageamongindividualsof100.829

830

Fig.S4.a)MLphylogenyinferredforthemtDNAdataset;b)MLphylogenyinferredfor831

theRAD-seqdataset;c)SVDquartetsphylogenyinferredfortheRAD-seqdataset;832

bootstrapnodesupports>80areshowndenotedbygraypoints,bootstrapnode833

supports>90areshowndenotedbyblackpoints.834

835

Fig.S5.STRUCTURErunsforK=2to7,plottedagainsttheRAD-seqbasedphylogeny.836

43

Fig.S6.Phylogenetictreesonthelocipartitionedintothesetsof2to6clusters,837

consideringonlythelocipresentinatleast100specimens.Bootstrapnodesupports>838

80areshowndenotedbygraypoints,bootstrapnodesupports>90areshowndenoted839

byblackpoints.Numbersbelowthetreesdenotethenumberofclustersintowhichthe840

datasetwasdivided.Plotoflog-likelihoodimprovementversusthenumberofclusters841

ispresentedinthefirstbox.842

843

AppendixS1.RAD-sequencingprotocol. 844