Insights from Advancing the Digitization of Biodiversity Collections (ADBC)

Deborah Paul, Greg Riccardi, Gil Nelson
iDigBio, Florida State University
ICEDIG, 5-6 March 2018
@idbdeb @griccardi @iDigGilNelson @iDigBio

iDigBio is funded by a grant from the National Science Foundation’s Advancing Digitization of Biodiversity Collections Program (Cooperative Agreement EF-1115210). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. All images used with permission or are free from copyright.

Insights from Advancing the Digitization of Biodiversity … · 2018-03-12 · 4 ADBC: Advancing Digitization of Biodiversity Collections • A call from the community > NIBA •


Topics

• ADBC Model Integrated Collections Network
• Community Building
• Resources developed
• Lessons learned
• Key components of such a program

How to get digitisation going (c. 2009)

• Step 1: Make a plan and get funding
• Step 2: Create a central coordination program
• Step 3: Fund digitization projects
• Step 4: Digitize and organize
• Step 5: Publish and use data
• ∞: figure out how to keep going

ADBC: Advancing Digitization of Biodiversity Collections

• A call from the community > NIBA
• US National Science Foundation
  – Budget is $100 million over 10 years, we are in year 7.
• The goal is to digitize and aggregate
  – 100s of millions of biological and paleontological records over the 10-year life of the project.
• iDigBio project is the hub of ADBC
  – U of Florida, Florida State U
• Digitization projects
  – Funded by NSF peer review
  – 20 Thematic Collections Networks
• We are encouraged by our funders to collaborate
  – Issues are global, effort needs to be global

ADBC is iDigBio and the Thematic Collection Networks (TCNs)

Credit: Malcolm Burrows

NIBA Strategic Plan 2010

• Vision Statement: NIBA will
  – Develop an inclusive, vibrant, partnership of U.S. biological collections
  – Document the nation’s biodiversity resources
  – Create a dynamic electronic resource
  – Serve the country’s needs in answering critical questions about the environment, human health, biosecurity, commerce, and the biological sciences

What does iDigBio do?

• Enable digitization of biodiversity collections data
  – Develop efficient & effective standards & workflows
  – Workforce education & training
• Provide portal access to biodiversity data in a cloud computing environment
  – Respond to cyberinfrastructure needs
  – Enable access & discoverability
• Facilitate use of biodiversity data to address key environmental and economic challenges
  – Researchers, educators, general public, policy-makers, …
• Plan for long-term sustainability of the national digitization network & effort
  – Expand participation: partners, data sources, public, …
  – Proliferate and broaden uses of biodiversity data

[Diagram labels: Cyber-infrastructure; Digitization; Education & Outreach; Serving the Research Community]

iDigBio Mission to Coordinate:

• Engaging the collections community
• Facilitating digitization & mobilization of data
• Providing portal and API access to data
• Facilitating research and outreach

108,000,000+
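
The "API access" item in the mission list can be illustrated with a short sketch. It only builds a query URL and does not contact any server. The endpoint `https://search.idigbio.org/v2/search/records` and the `rq` (record query) and `limit` parameters are written here as assumptions about the public iDigBio search API and should be checked against the current API documentation. The function name `build_record_query_url` and the example query are hypothetical and do not come from the slides.

```python
import json
from urllib.parse import urlencode

# Assumed base URL of the iDigBio record search endpoint; verify before use.
SEARCH_URL = "https://search.idigbio.org/v2/search/records"


def build_record_query_url(record_query, limit=10):
    """Return a search URL whose `rq` parameter carries the query as JSON."""
    params = {
        "rq": json.dumps(record_query, sort_keys=True),
        "limit": limit,
    }
    # urlencode percent-encodes the JSON so it is safe inside a URL.
    return SEARCH_URL + "?" + urlencode(params)


# Hypothetical example query: specimen records in the genus Acer.
url = build_record_query_url({"genus": "acer"}, limit=5)
print(url)
```

Keeping URL construction separate from the network call makes the query easy to inspect and test before any request is sent.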

Advancing Digitization of Biodiversity Collections ADBC
National Digitization Network
675 participating collections in 336 institutions (20 TCNs + 23 PENs)

Vertebrates, invertebrates, plants, fossils, fungi, tissues, sounds, videos, 2D, 3D, …

iDigBio Portal has 1,537 recordsets containing 105M records for ≈318M specimens with 23M associated media records

Thematic Collections Networks (2 of 20)…

SCAN TCN

a Data Portal Built to Visualize, Manipulate, and Export Species Occurrences

• Southwest Collection of Arthropods TCN evolves
  – into Symbiota Collections of Arthropods Network
  – From one project to many projects
  – Supported by a common platform
  – Customized based on community input
• 3 TCNs: SCAN, LepNet, and InvertEBase
• Each museum or project is a separate collection in the database
  – but all collections searchable together

More Thematic Collection Networks Highlights

• Thiers – MaCC to MiCC
  – infrastructure, community
• Experimenting with light-field photography
  – InvertEBase
• Linking data – ePANDDA
• Still need to go 5x faster (Cobb)
  – georeferenced and id,
  – gaps for Diptera, predator – parasitoid

county=“devon” 2,124

Community Building Activities

• Education Outreach: Citizen Science, K-12 materials, Undergraduate, Fossil Clubs, Mentor teachers
• Digitization: Workflows, Protocols, Task Clusters, Dissemination
• Research Use: Tool collaboration, Portal development, ENM workshop, Research Spotlight, Data quality
• Training: Biodiversity data skills, Data literacy, Collections software, Imaging, Project Management

iDigBio Success: Workshop Principles

• Community-driven process
• Each workshop
  – Created in response to need
  – Organized by interested parties
  – Attended by diverse group
• Geographical distribution of sites
• Demographic distribution
• Repository of materials for all

Workshops reveal pattern of skills needs and knowledge gaps

• What skills are needed to mobilize and use the data?

• Digitisation workflow workshops
  – Flat Sheets and Packets, Pinned Specimens in Trays and Drawers, Things in Spirits, 3D objects in Trays, Imaging, …
• Capacity building needs revealed
  – software
  – standards
  – data cleaning and management
  – spreadsheets, text files
  – data visualization and synthesis
  – recognizing automatable tasks
  – limited number of people in the community with the necessary skills

Actions

• Partner in developing and implementing Data Carpentry, now
• Biodiversity Informatics Workshop Series at iDigBio
  – Data Carpentry
  – Managing NHC Data
  – Demystifying Data Standards and the IPT
  – Field to Database
• Partner in Biodiversity Informatics 101 at SPNHC
• Partner in Darwin Core Hour
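
Several of the capacity needs listed above (data cleaning, spreadsheets and text files, recognizing automatable tasks) come down to small scripted steps. The following is a minimal sketch, assuming a CSV export with Darwin Core-style column names (`catalogNumber`, `scientificName`, `county`). The sample rows and the `clean_rows` helper are invented for illustration and are not taken from any workshop material.

```python
import csv
import io

# Invented sample data standing in for a spreadsheet export.
RAW = """catalogNumber,scientificName,county
A-001,  Acer rubrum ,Leon
A-002,acer  rubrum,LEON
A-003,Quercus alba,
"""


def clean_rows(text):
    """Collapse whitespace, capitalize the genus, and title-case the county."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Split on any whitespace and rejoin with single spaces.
        name = " ".join(row["scientificName"].split())
        if name:
            name = name[0].upper() + name[1:]
        county = row["county"].strip().title()
        cleaned.append({
            "catalogNumber": row["catalogNumber"].strip(),
            "scientificName": name,
            "county": county or None,  # keep missing values explicit
        })
    return cleaned


rows = clean_rows(RAW)
for r in rows:
    print(r)
```

Scripting even a simple rule like this means the same cleaning can be repeated on every new export, which is the kind of automatable task the workshops aimed to help people recognize.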

Developing a Collections Digitization and Data Use Community

[Chart: iDigBio’s Evolving Focus. Horizontal axis: project years 2011 (Year 1) to 2023 (Year 13). Vertical axis: 0 to 16. Series: Discovery and Development; Digitization Best Practices; Research Use of NHC Data]

Cool research uses

Predicting Extinction

Using convolutional neural networks to automate tropical pollen counts and identification

Collecting trends: how wars and human history influence biological collections

Sinervo, B. et al. Erosion of lizard diversity by climate change and altered thermal niches. Science 328, 894-899 (2010)

Derek Haselhorst, Program in Ecology, Evolution and Conservation Biology, University of Illinois, iDigBio Research Spotlight: September 2017

Vaughn Shirey, The Academy of Natural Sciences of Drexel University, now here at Luomus, in iDigBio Spotlight: March 2018

More research published

• Workflows
• Digitisation methods
• Imaging, CT, recordings, CNN
• Phenology
• Public participation
• Georeferencing
• Small collection import
  – gap analysis strategic planning, research
  – awareness

Exemplary initiatives

• Entomological Collections Network
  – creates a cohesive entomological collections family
• SPNHC CC Network and EPG
  – University affiliated
  – creates momentum, addresses limited expertise and money, while capitalizing on opportunities for students and early career professionals to drive change
• We Dig Bio
  – Public Participation
  – Visibility and engagement in local and worldwide events
    • creates worldwide relevance for your small collection and community
• NANSH.org
  – offers a complete model from how to organize and where to share results
• The Carpentries
  – foundational biodiversity informatics skills and literacy for reproducible research

My wish list for DiSSCo

• set up plan for data flow before beginning
  – data back to providers
  – strong data standards
  – prescriptive and proscriptive examples
• explicit identifier recommendations / requirements
• implement annotations collaboratively
• set up a clear citation / attribution strategy
  – for the project, for data, for collections
  – branding, social norms, automated
  – visualize research done
• managing media is also challenging
  – “send me a hard drive” still exists
  – data provider infrastructure access issues
  – address archival storage
• (require?) robust collections metadata
• encourage use of extensions, or other methods for getting richer recordsets
• support need for (improving / offering)
  – taxon, locality, people service and authority files
• duplicate / related-object finding
• networked – so users can tell which aggregators have which recordsets
• support metrics needs
• support media analysis (ML, CNN, …)
• hardware / software general recommendations?
  – public cloud or open source software stacks where possible
• capacity + community building
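
The "duplicate / related-object finding" item on the wish list can be made concrete with a small sketch. It groups records that share a normalized key built from collector, collector number, and date. This is one simple heuristic among many and is not a method described in the talk. The field names follow Darwin Core terms (`recordedBy`, `recordNumber`, `eventDate`); the records and the `find_duplicate_candidates` function are hypothetical.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase the text and drop everything except letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def find_duplicate_candidates(records):
    """Group record ids by (collector, number, date); keep groups of 2 or more."""
    groups = defaultdict(list)
    for rec in records:
        key = (
            normalize(rec["recordedBy"]),
            normalize(rec["recordNumber"]),
            rec["eventDate"],
        )
        groups[key].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented records held by two different institutions.
records = [
    {"id": "inst-a:1", "recordedBy": "A. Collector", "recordNumber": "1234", "eventDate": "1998-05-02"},
    {"id": "inst-b:9", "recordedBy": "A Collector", "recordNumber": "1234", "eventDate": "1998-05-02"},
    {"id": "inst-a:2", "recordedBy": "A. Collector", "recordNumber": "1235", "eventDate": "1998-05-03"},
]
print(find_duplicate_candidates(records))
```

The groups are only candidates: exact-key matching misses spelling variants and can pair unrelated specimens, so a real service would likely need fuzzier matching and human review.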

Where does iDigBio go from here?

• Limited time program (10 years)
• How to sustain activities?
  – Digitization projects?
  – Support for digitization improvements?
  – Data mobilization and use skills?
• How to sustain data infrastructure?
  – Data persistence?
  – Data quality?
  – Data portal?
• How to sustain commitment
  – Governmental?
  – Community?

iDigBio Successes: Using Data

• Take a look at the monthly Research Spotlight and Research on our website.

• Watch the presentations and read discussions from the iDigBio workshop Using Biodiversity Specimen-Based Data to Study Global Change.

• Be Ignited by speakers at the Ecological Society of America 2015 session Enhancing Ecological Research with iDigBio Biological Specimen Data.

• Find out more about Big Data and Bugs: How Massively Collected Biodiversity Data Are Changing the Way We Do Insect Science at the Entomological Society of America 2017.

• Listen to Gil Nelson’s talk highlighting Research Outcomes of the ADBC Community’s Efforts to Digitize Data for Biodiversity Research at iDigBio's Summit VII 2017.

• Discuss open research project ideas on GitHub with iDigBio and collaborators.

• Check out GUODA and Effechecka and Fresh Data

oVert

Effechecka

www.idigbio.org

facebook.com/iDigBio

twitter.com/iDigBio

vimeo.com/idigbio

idigbio.org/rss-feed.xml

webcal://www.idigbio.org/events-calendar/export.ics

Kiitos paljon ICEDIG, Anna palaa

Thanks a lot ICEDIG, Go for it!

Thematic Collections Networks (TCNs) and Partners to Existing Networks (PENs)

TCN: network of institutions strategically digitizing information for a particular research theme, such as impacts of climate change or biota of a region.

The Mid-Atlantic Megalopolis

Cretaceous World

SoRo

oVert