Connect.barcodeoflife.org
• Promote barcoding as a global standard
• Build participation
• Working Groups
• BARCODE standard
• International Conferences
• Increase production of public BARCODE records
Networks, Projects, Organizations
Barcode of Life Community
Principles and Goals
• Free and open access
• Standardization and scalability
• Specimen-centered
• Rapid data release following primary QA/QC
• Ongoing crowd-sourced data curation
• Enable accelerated modern taxonomy
• Navigate across data types (DNA, specimens, species, publications, georeferences)
• Locate, aggregate, display and analyze data, resources
How Barcoding Works
• Building the reference library:
  – Well-identified specimen
  – Tissue subsample
  – DNA extraction, PCR amplification
  – DNA sequencing
  – Data submission to GenBank
• Using the reference library:
  – Unidentified specimen
  – Tissue, DNA, sequencing
  – Comparison with reference sequences
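The comparison step can be sketched in a few lines. This is a minimal illustration, not the method used by BOLD or GenBank: it assumes the sequences are already aligned to equal length and uses a simple uncorrected p-distance. The function names and the toy sequences are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned, equal-length sequences.

    Sites where either sequence has a non-ACGT symbol are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper())
                if x in "ACGT" and y in "ACGT"]
    if not compared:
        raise ValueError("no unambiguous sites to compare")
    return sum(x != y for x, y in compared) / len(compared)


def closest_reference(query: str, library: dict) -> tuple:
    """Return (species name, distance) of the nearest reference sequence."""
    return min(((name, p_distance(query, seq)) for name, seq in library.items()),
               key=lambda pair: pair[1])


# Toy reference library (made-up 20 bp sequences, not real barcodes).
REFERENCE_LIBRARY = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACGAACGTACCT",
}
query = "ACGTACGTACGTACGTACGA"  # sequence from the unidentified specimen

print(closest_reference(query, REFERENCE_LIBRARY))  # ('Species A', 0.05)
```

A real identification would also need a distance threshold to decide whether the nearest match is close enough to assign a species name.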
How Barcoding is Done
From specimen to sequence to species
Figure: Collecting, Voucher Specimen, DNA extraction, CO1 gene, DNA sequencing, Trace file, Public Databases of Barcode Records (other gene labels shown: ND1, ND2, ND3, COIII)
NBII, 25 February 2009
BOLD Workbench for Barcode Data Assembly/Analysis
GenBank, EMBL, and DDBJ
Official Archival Repositories of Barcode Data
http://www.insdc.org/
Current Norm: High throughput
Large labs, hundreds of samples per day
ABI 3100 capillary automated sequencer
Large capacity PCR and sequencing reactions
• US$100-150K purchase
• 2-3 hours processing time
• 150-500 samples per day
• US$3-5 per sample
Technology Development Partnership Goal
The DNA Sequencing Lab of 2013?
Producing Barcode Data: 201?
Barcode data anywhere, instantly
• Data in seconds to minutes
• Pennies per sample
• Link to reference database
• A taxonomic GPS
• Usable by non-specialists
Status of Barcode Data
• BOLD records (public and private):
  – 956,000 records, 78,000 named species
• BARCODE records in GenBank:
  – 194,000 records
  – Insects: 150,000 records
  – Fish: 23,500 records
  – Birds: 6,000 records
  – Mammals: 2,500 records
BARCODE Data Standard
Required Elements for COI
• Species designation
• Voucher ID in standard Darwin Core format
• Minimum 500 bp, <1% ambiguous sites
• Bidirectional overlapping reads, 2 trace files
• Primer name and sequences
• Country/ocean region
• Strongly recommended:
  – Collection date and collector
  – Identifier
  – Latitude/longitude
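The required elements lend themselves to an automated check. The sketch below is only an illustration: the `BarcodeRecord` fields and the `unmet_requirements` function are hypothetical names, not part of any GenBank or BOLD API, and the ambiguity rule is read here as "fewer than 1% ambiguous sites", which is an interpretation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BarcodeRecord:
    """Hypothetical container for the required BARCODE elements."""
    species: str = ""
    voucher_id: str = ""
    sequence: str = ""
    trace_files: int = 0
    primers: List[str] = field(default_factory=list)
    country_or_ocean: str = ""


def unmet_requirements(rec: BarcodeRecord) -> List[str]:
    """List the required elements that a record fails to satisfy."""
    problems = []
    if not rec.species:
        problems.append("species designation")
    if not rec.voucher_id:
        problems.append("voucher ID")
    seq = rec.sequence.upper()
    if len(seq) < 500:
        problems.append("minimum 500 bp")
    # Any symbol other than A, C, G, T counts as ambiguous (assumption).
    ambiguous = sum(base not in "ACGT" for base in seq)
    if seq and ambiguous / len(seq) >= 0.01:
        problems.append("fewer than 1% ambiguous sites")
    if rec.trace_files < 2:
        problems.append("2 trace files")
    if not rec.primers:
        problems.append("primer name and sequences")
    if not rec.country_or_ocean:
        problems.append("country/ocean region")
    return problems


# Placeholder values for illustration only.
good = BarcodeRecord("Species A", "NHM:LEP:123456", "ACGT" * 150, 2,
                     ["forward primer", "reverse primer"], "Costa Rica")
short = BarcodeRecord("Species A", "NHM:LEP:123456",
                      "ACGT" * 100 + "N" * 10, 1, [], "")

print(unmet_requirements(good))   # []
print(unmet_requirements(short))
```

The strongly recommended elements (collection date, collector, identifier, latitude/longitude) are left out on purpose, since they are not required.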
Non-COI regions for other taxa
• Land plants:
  – Chloroplast matK and rbcL approved Nov 09
  – Non-coding plastid and nuclear regions being explored
• Fungi and protists:
  – CBOL Working Groups convened
  – Recommendations expected in 2010
Barcode Sequence
Voucher Specimen
Species Name
Specimen Metadata
Literature (link to content or citation)
BARCODE Records in INSDC
Indices - Catalogue of Life - GBIF/ECAT
Nomenclators - Zoo Record - IPNI - NameBank
Publication links - New species
Georeference
Habitat
Character sets
Images
Behavior
Other genes
Trace files
Other Databases
Phylogenetic
Pop’n Genetics
Ecological
Primers
Databases - Provisional sp.
Linkout from GenBank to BOLD
ISBER: 13 May 2009
Linkout from GenBank to Taxonomy
Link from GenBank to Museums
Washington Airport Gate 3
• Dulles, National, or Baltimore-Washington?
• 2 concourses at BWI: concourse A or B?
• 3 concourses at National
• 4 Dulles concourses
The Controlled Vocabulary of Airport Codes
Darwin Core Triplet
Structured Link to Vouchers
Institutional Acronym : Collection Code : Catalog ID
Structured Link to Vouchers
NHM:LEP:123456
personal:DHJanzen:SRNP12345
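A triplet of this kind is straightforward to parse. The sketch below assumes the three fields are written in a single string separated by colons (for example `NHM:LEP:123456`); `VoucherTriplet` and `parse_triplet` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class VoucherTriplet(NamedTuple):
    institution: str   # institutional acronym, e.g. "NHM" or "personal"
    collection: str    # collection code, e.g. "LEP"
    catalog_id: str    # catalog ID within that collection


def parse_triplet(text: str) -> VoucherTriplet:
    """Split 'institution:collection:catalog' into its three fields."""
    parts = [part.strip() for part in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three non-empty fields, got {text!r}")
    return VoucherTriplet(*parts)


museum = parse_triplet("NHM:LEP:123456")
private = parse_triplet("personal:DHJanzen:SRNP12345")

print(museum.institution, museum.collection, museum.catalog_id)
print(private.institution, private.collection, private.catalog_id)
```

The parser only checks the shape of the string. Whether the institutional acronym actually resolves to one institution is a separate problem, which is why a controlled vocabulary of acronyms matters.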
NCBI’s Biorepository List
• Compiled from Index Herbariorum, literature sources, GenBank submissions
• 6,936 records
• 1,177 records with non-unique acronyms
• 517 homonymous acronyms
• 374 shared by two records
• 143 shared by three records
AMNH | Icelandic Institute of Natural History, Akureyri Division | Akureyri | Iceland
AMNH | American Museum of Natural History | New York | USA
UNL | Universidad Autónoma de Nuevo León | Monterrey, Nuevo León | Mexico
UNL | University of Nebraska State Museum | Lincoln, Nebraska | USA
UNL | Centro de Estratigrafia e Paleobiologia da Universidade Nova de Lisboa | Monte de Caparica | Portugal
ZMK | Zoological Museum, Kristiania | Oslo | Norway
ZMK | Zoologisches Museum der Universität Kiel | Kiel | Germany
ZMK | Zoological Museum, Copenhagen | Copenhagen | Denmark
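Homonymous acronyms like these can be found by grouping records on the acronym. The sketch below uses only the eight example records listed above; the `find_homonyms` name and the tuple layout are assumptions for illustration, not the structure of NCBI's list.

```python
from collections import Counter, defaultdict

# (acronym, institution) pairs taken from the examples above.
RECORDS = [
    ("AMNH", "Icelandic Institute of Natural History, Akureyri Division"),
    ("AMNH", "American Museum of Natural History"),
    ("UNL", "Universidad Autónoma de Nuevo León"),
    ("UNL", "University of Nebraska State Museum"),
    ("UNL", "Centro de Estratigrafia e Paleobiologia da Universidade Nova de Lisboa"),
    ("ZMK", "Zoological Museum, Kristiania"),
    ("ZMK", "Zoologisches Museum der Universität Kiel"),
    ("ZMK", "Zoological Museum, Copenhagen"),
]


def find_homonyms(records):
    """Map each acronym used by more than one record to its institutions."""
    by_acronym = defaultdict(list)
    for acronym, institution in records:
        by_acronym[acronym].append(institution)
    return {a: names for a, names in by_acronym.items() if len(names) > 1}


homonyms = find_homonyms(RECORDS)
# How many acronyms are shared by two records, by three records, and so on.
shared_by = Counter(len(names) for names in homonyms.values())

print(sorted(homonyms))   # ['AMNH', 'UNL', 'ZMK']
print(dict(shared_by))
```

The counts reported for the full list are consistent with each other: 374 + 143 = 517 homonymous acronyms, and 2 × 374 + 3 × 143 = 1,177 records with non-unique acronyms.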
CBOL/GBIF/NCBI Registry of Biorepositories
www.biorepositories.org
Mixture of:
• Single collections
• Repository institutions
• Networks/consortia
• Databases
• NGOs
Does NOT include:
• GenBank
• EMBL
• DDBJ
• BOLD
What Should We Do?
CBOL will invest a year to populate institution and collection data in biorepositories.org
• Hope to build synchronization with:
  – Institution database at GenBank
  – Index Herbariorum
  – Authority files in BOLD
• Hope to install web services
• How can we accelerate the registration process?
• Where should the data reside long-term?
  – GenBank?
  – GBIF?