Tips for taxonomic cleaning with the TNRS
Brad Boyle, University of Arizona
9 January 2016
Taxonomic cleaning
• Why bother?
• Taxonomic scrubbing applications
• General glitches and gotchas
• TNRS glitches and gotchas
• Pre-processing
• Post-processing
• Understanding the output
Taxonomic cleaning: why bother?
• Hieronyma oblonga: widespread tropical tree
• Hieronyma poasana: synonym of Hieronyma oblonga, once thought to be endemic to Costa Rica
• Hyeronima oblonga, Hieronima oblonga: common misspellings of Hieronyma oblonga
Why bother?
• 10% “bad” names
• Overlap between databases only 3%!
• 400% increase in overlap
Taxonomic cleaning applications
• TNRS: http://tnrs.iplantcollaborative.org/index.html
• TaxonStand: http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2012.00232.x/full
• Global Name Resolver: http://resolver.globalnames.org/
• PlantMiner: http://www.plantminer.com/
• Many others…
General architecture
• Name parser
– Breaks up and classifies name components
Example: Hieronima poasana Standley
Genus: Hieronima
Specific epithet: poasana
Authority: Standley
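The parsing step can be sketched in a few lines. This is a deliberately naive illustration, not how the TNRS parser works: it assumes a simple binomial where the first token is the genus, the second is the specific epithet, and everything after that is the authority. The function name `parse_name` and the dictionary keys are made up for this example.

```python
def parse_name(name):
    """Naively split a binomial into genus, specific epithet and authority.

    Assumption: token 1 is the genus, token 2 the epithet, the rest the
    authority. Real names (hybrids, infraspecific ranks) need more care.
    """
    parts = name.split()
    genus = parts[0] if parts else ""
    epithet = parts[1] if len(parts) > 1 else ""
    authority = " ".join(parts[2:])
    return {"genus": genus, "specific_epithet": epithet, "authority": authority}


parsed = parse_name("Hieronima poasana Standley")
print(parsed)
```

Note that the parser only classifies the pieces; it does not notice that "Hieronima" is misspelled. That is the resolver's job.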
General architecture
• Name resolver
– Matches the name to a reference database
– Tries fuzzy matching if exact match fails
Example:
Hieronima poasana Standley (misspelled)
Hieronyma poasana Standl. (correct spelling, as published)
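A rough sketch of "exact match first, fuzzy match as fallback", using Python's standard `difflib` module. The similarity measure here is `difflib`'s, swapped in for illustration; it is not the matching algorithm the TNRS uses. The tiny reference list, the 0.8 cutoff and the name `resolve_name` are all assumptions.

```python
import difflib

# Toy reference database; a real one holds hundreds of thousands of names.
REFERENCE_NAMES = ["Hieronyma oblonga", "Hieronyma poasana"]


def resolve_name(name, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return (matched_name, match_type) for a submitted name."""
    if name in reference:
        return name, "exact"
    # Fall back to the single closest reference name above the cutoff.
    close = difflib.get_close_matches(name, reference, n=1, cutoff=cutoff)
    if close:
        return close[0], "fuzzy"
    return None, "no match"


for submitted in ["Hieronyma oblonga", "Hieronima poasana", "Hyeronima oblonga"]:
    print(submitted, "->", resolve_name(submitted))
```

The cutoff is the knob to watch: too low and unrelated names match, too high and real misspellings are missed.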
General architecture
• Taxonomic status & synonym conversion
– Some applications do not do this last step
Example:
Hieronyma poasana Standl. (synonym)
Hieronyma oblonga (Tul.) Müll. Arg. (currently accepted name)
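Synonym conversion amounts to a lookup from the matched name to its accepted name. The sketch below assumes a small in-memory table; the `TAXONOMY` structure and the function `to_accepted` are invented for illustration, and the two records simply restate the Hieronyma example.

```python
# Toy taxonomy table keyed by matched name (assumed structure).
TAXONOMY = {
    "Hieronyma oblonga": {"status": "Accepted", "accepted_name": "Hieronyma oblonga"},
    "Hieronyma poasana": {"status": "Synonym", "accepted_name": "Hieronyma oblonga"},
}


def to_accepted(matched_name, taxonomy=TAXONOMY):
    """Return the accepted name for a matched name, or None if unknown."""
    record = taxonomy.get(matched_name)
    if record is None:
        return None
    return record["accepted_name"]


print(to_accepted("Hieronyma poasana"))
```

With this step applied, records filed under Hieronyma poasana and Hieronyma oblonga end up under one name.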
Example workflow with TNRS API
• Script: tnrs_api_example.R
• Steps:
1. Extract the names
2. Turn into a string separated by commas
3. URL-encode and send to the TNRS API
4. Convert the returned JSON to data frame
5. Update your names
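The steps above can be sketched in Python without touching the network. This is not a translation of tnrs_api_example.R. The base URL is a placeholder, and the query parameter `names` and the JSON fields `items`, `nameSubmitted` and `acceptedName` are assumptions made for illustration; check the TNRS API documentation for the real endpoint and response shape.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint (hypothetical); substitute the real TNRS API URL.
BASE_URL = "https://example.org/tnrs/matchNames"


def build_request_url(names, base_url=BASE_URL):
    """Steps 2-3: join names with commas and URL-encode them."""
    joined = ",".join(names)
    return base_url + "?" + urlencode({"names": joined})


def json_to_rows(json_text):
    """Step 4: flatten the (assumed) JSON response into a list of rows."""
    payload = json.loads(json_text)
    return [
        {"submitted": item["nameSubmitted"], "accepted": item["acceptedName"]}
        for item in payload["items"]
    ]


def update_names(names, rows):
    """Step 5: swap in the accepted name, keeping the original if none came back."""
    lookup = {row["submitted"]: row["accepted"] for row in rows}
    return [lookup.get(name) or name for name in names]


names = ["Hieronyma poasana", "Hyeronima oblonga"]  # step 1
url = build_request_url(names)

# Hand-written stand-in for a response, in the assumed shape.
SAMPLE_JSON = json.dumps({"items": [
    {"nameSubmitted": "Hieronyma poasana", "acceptedName": "Hieronyma oblonga"},
    {"nameSubmitted": "Hyeronima oblonga", "acceptedName": "Hieronyma oblonga"},
]})
cleaned = update_names(names, json_to_rows(SAMPLE_JSON))
print(url)
print(cleaned)
```

In a real run the URL would be fetched (for example with `urllib.request.urlopen`) and the response body passed to `json_to_rows`.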
Pros and Cons of TNRS API
• Advantages
– Fast, simple, fully automated
• Disadvantages
– Can’t adjust all settings available in web interface
– Uses Tropicos as only source
– Can’t take advantage of web interface to inspect results, choose alternative matches and research names
– Can’t access download options available in web interface
– Parse-only option not available
Example basic workflow with TNRS web interface
• Script: tnrs_gui_example.R
• Steps:
1. Extract names to CSV file with two columns: Unique ID & names
2. Upload to TNRS using bulk “Upload and Submit List” tab, checking box “My file contains an identifier as first column”
3. Adjust name processing settings and submit
4. Inspect results online, selecting alternate matches if appropriate
5. Download results, using options: Best matches only, Detailed results, UTF-8 format
6. Import TNRS results as tab-delimited file
7. Remaining processing as for API
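Steps 1 and 6 are the two scriptable ends of this workflow; the middle steps happen in the browser. A minimal sketch with Python's `csv` module, writing to strings rather than files so it runs as-is. It assumes the upload file has no header row, and the result column names `user_id`, `Name_submitted` and `Accepted_name` are placeholders; use the headers found in the actual download.

```python
import csv
import io


def names_to_csv(names):
    """Step 1: two columns, a unique ID then the name (no header assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for unique_id, name in enumerate(names, start=1):
        writer.writerow([unique_id, name])
    return buffer.getvalue()


def read_tnrs_results(text):
    """Step 6: parse a tab-delimited download into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


upload_text = names_to_csv(["Hieronyma poasana", "Hyeronima oblonga"])

# Hand-written stand-in for a downloaded result file (assumed columns).
SAMPLE_RESULTS = (
    "user_id\tName_submitted\tAccepted_name\n"
    "1\tHieronyma poasana\tHieronyma oblonga\n"
    "2\tHyeronima oblonga\tHieronyma oblonga\n"
)
results = read_tnrs_results(SAMPLE_RESULTS)
print(upload_text)
print(results[0]["Accepted_name"])
```

Carrying the unique ID through the upload is what lets the results be joined back to the original records afterwards.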
Pros and Cons of TNRS Web Interface
• Disadvantages
– Not fully automated
• Advantages
– Can adjust name resolution settings
– More name resolution sources
– Use web interface to inspect results, choose alternative matches and research names
– Select and download alternative matches on the fly
– More download options, including “All matches” (useful if you don’t like how TNRS chooses best match and want to script it yourself)
– Parse-only mode (useful for comparing part of original name to matched name)
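If the “All matches” download is used, choosing the best match becomes a scripting task. The sketch below keeps the highest-scoring candidate per submitted ID. The row keys `id`, `matched` and `score` are hypothetical, and "highest score wins" is only one possible rule, not a description of how TNRS ranks matches.

```python
def pick_best_matches(all_matches, score_key="score"):
    """Keep the highest-scoring candidate for each submitted ID."""
    best = {}
    for row in all_matches:
        current = best.get(row["id"])
        if current is None or row[score_key] > current[score_key]:
            best[row["id"]] = row
    return best


# Hand-written candidates in an assumed shape.
candidates = [
    {"id": 1, "matched": "Hieronyma poasana", "score": 0.94},
    {"id": 1, "matched": "Hieronyma oblonga", "score": 0.60},
    {"id": 2, "matched": "Hieronyma oblonga", "score": 0.88},
]
best = pick_best_matches(candidates)
print(best[1]["matched"])
```

A custom rule could also weigh taxonomic status or source, for example preferring an accepted name over a slightly better-scoring invalid one.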
TNRS Tips & Gotchas
• Tip: Prepend family to name to prevent matching similar names in different families
• Gotcha: If you want to use The Plant List, *always* select TPL + ILDIS + GCC together
• Tip: Research any name where Taxonomic Status is not Accepted or Synonym
• Gotcha: Even accepted names can be wrong!
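The first tip is a one-line pre-processing step. A minimal sketch, assuming each record already carries a family; the helper name `prepend_family` is made up, and a missing family leaves the name unchanged.

```python
def prepend_family(family, name):
    """Put the family in front of the name; leave the name alone if no family."""
    family = (family or "").strip()
    return f"{family} {name}" if family else name


submitted = prepend_family("Phyllanthaceae", "Hieronyma oblonga")
print(submitted)
```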
Taxonomic Status
Taxonomic Status refers to the Matched Name
• Accepted: Good to go!
• Synonym: Good to go, as long as accepted name supplied
• No opinion: Could be good or bad name. RESEARCH IT
• Invalid: Never validly published. DON’T USE
• Illegitimate: Violates nomenclatural rules. DON’T USE
• Rejected name: Rejected by nomenclatural committee. DON’T USE
• Misapplied name: Commonly misapplied to the wrong species. May or may not be correct. RESEARCH IT
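The status list maps naturally onto a post-processing triage table. The sketch below encodes the guidance above; the action labels and the function `action_for` are invented for illustration, and sending unrecognized statuses to "research" is an added assumption.

```python
# Action per Taxonomic Status, following the list above.
STATUS_ACTION = {
    "Accepted": "use",
    "Synonym": "use accepted name",
    "No opinion": "research",
    "Invalid": "do not use",
    "Illegitimate": "do not use",
    "Rejected name": "do not use",
    "Misapplied name": "research",
}


def action_for(status, accepted_name=""):
    """Return what to do with a matched name given its status."""
    # A synonym is only usable when an accepted name was actually supplied.
    if status == "Synonym" and not accepted_name:
        return "research"
    # Assumption: anything unrecognized gets researched rather than trusted.
    return STATUS_ACTION.get(status, "research")


print(action_for("Synonym", "Hieronyma oblonga"))
print(action_for("No opinion"))
```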
Even accepted names can be wrong!
Name submitted              Tropicos                      The Plant List
Henriettea fascicularis     =Henriettella fascicularis    =Henriettella fascicularis
Henriettea ramiflora        Accepted                      Accepted
Henriettea succosa          Accepted                      Accepted
Henriettella fascicularis   Accepted                      Accepted
Henriettella tuberculosa    Accepted                      =Henriettea tuberculosa
Actually, all belong in Henriettea