A presentation to the Genomic Standards Consortium 15 meeting in Bethesda, MD on 23 April 2013
Encyclopedia of Life
eol.org @eol
Cynthia Parr @cydparr
GSC15 23 April 2013
A webpage for every species
How EOL works
EOL
Crowds
Harvest
Third party applications
EOL
Plinian Core
DwC description
SPM infoitem
using Dublin Core & Audubon Core for other metadata
Darwin Core Archive flat files as transport mechanism
Sharing process adds semantics to content objects
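As a rough illustration of how a flat-file transport can carry semantics, the sketch below reads a small tab-delimited table in which each text object is tagged with a subject term, then counts objects per subject. The column names (`taxon`, `subject`, `text`) and all rows are invented for illustration and are not EOL's actual archive schema.

```python
import csv
import io
from collections import Counter

# Hypothetical tab-delimited flat file; column names and rows are made up.
FLAT_FILE = (
    "taxon\tsubject\ttext\n"
    "Taxon A\tDistribution\tFound in coastal waters.\n"
    "Taxon A\tHabitat\tLives on rocky reefs.\n"
    "Taxon B\tDistribution\tWidespread in lowland forest.\n"
)


def count_by_subject(flat_text):
    """Count text objects per subject term in a tab-delimited table."""
    reader = csv.DictReader(io.StringIO(flat_text), delimiter="\t")
    return Counter(row["subject"] for row in reader)


counts = count_by_subject(FLAT_FILE)
print(counts.most_common())
```

Once every content object carries a subject tag like this, grouping and filtering by topic becomes a one-line operation rather than a text-mining problem.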
EOL Today
Key Milestones in 2013
1.1 million species pages
240+ content providers
3 million unique visitors from 223 countries & territories
EOL
GBIF
NCBI
with Anne Bowser, University of Maryland
EOL connects hubs
BioNames
Rod Page, Ryan Schenk
iphylo.blogspot.com
Anatolia Zooarchaeology Case Study led by Alexandria Archive Institute
Research goals and outcomes:
– Improve archaeological data collection / documentation practices
– Better understanding of gaps (spatial and temporal)
– Integrated biometrics show complex patterns (introduction of domestic and continued use of wild animals by region)
– Aligning data to EOL taxon identifiers helps draw out patterns in relative proportion of taxa over time and space across many assemblages
EOL Computable Data Challenge
1. 14 different sites
2. 34+ zooarchaeologists
3. Decoding, cleanup, metadata documentation
4. 220,000+ specimens
5. 450 entities linked to 143 EOL taxon concepts
6. Anatomical entities linked to Uberon.org
7. Biometrics linked to measurement ontology
8. Collaborative analysis
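The value of linking local entities to shared taxon concepts can be sketched as follows: once different labels resolve to the same identifier, relative proportions of taxa become comparable across sites. This is a minimal sketch, not the project's actual workflow; the `taxon:...` identifiers are placeholders rather than real EOL IDs, and the specimen records are made up.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from local labels to shared taxon identifiers.
# The "taxon:..." identifiers are placeholders, not real EOL IDs.
LOCAL_TO_TAXON = {
    "sheep": "taxon:0001",
    "Ovis": "taxon:0001",
    "goat": "taxon:0002",
    "pig": "taxon:0003",
}

# Made-up specimen records: (site, local label).
SPECIMENS = [
    ("Site 1", "sheep"),
    ("Site 1", "Ovis"),
    ("Site 1", "goat"),
    ("Site 1", "pig"),
    ("Site 2", "pig"),
    ("Site 2", "pig"),
]


def taxon_proportions(specimens, mapping):
    """Return {site: {taxon_id: share of that site's linked specimens}}."""
    counts = defaultdict(Counter)
    for site, label in specimens:
        taxon_id = mapping.get(label)
        if taxon_id is not None:  # skip labels with no linked taxon
            counts[site][taxon_id] += 1
    return {
        site: {t: n / sum(c.values()) for t, n in c.items()}
        for site, c in counts.items()
    }


proportions = taxon_proportions(SPECIMENS, LOCAL_TO_TAXON)
for site, shares in proportions.items():
    print(site, shares)
```

Without the mapping step, "sheep" and "Ovis" would be counted as two separate categories at Site 1 and the proportions would not line up between assemblages.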
http://opencontext.org/
[Bar chart: number of text objects (axis from 0 to 800,000) by subject of text object. Subjects shown: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]
Promote NLP text mining, crowdsourcing, standardizing
• Species Interaction Datasets—Integration, Visualization, and Analysis (Poelen and Mungall)
• Discovering EnvO habitat terms in EOL contents (Pafilis)
• Altitude Specificity of Flower Coloration (Wright)
• Crowd-sourced data to examine morphological impacts of extinction risk in ray-finned fishes (Chang)
• Macroecological patterns in butterfly-hostplant associations (Ferrer-Paris)
EOL GloBI
Global Biotic Interactions
Challenge: Species interaction datasets are mostly buried in flat files & custom formats.
Plan: Build infrastructure for normalizing and aggregating species interaction datasets and make them accessible through flat files (Darwin Core Archive), web services, and semantic web endpoints (SPARQL).
Eventually: Publish biotic interaction ontology re-using existing ontologies, re-integrate with EOL
Enable semantic interoperability to allow for cross-functional analysis (e.g. How does a parasite regulate gene expression of host?)
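The normalize-and-aggregate idea in the plan above can be sketched in a few lines. This is an illustrative sketch only, not GloBI's code: the two source layouts, the taxon names, and the `parasiteOf` label are assumptions made up for the example.

```python
import csv
import io

# Two made-up source files with different layouts and column names.
SOURCE_A = "predator,prey\nTaxon A,Taxon B\nTaxon A,Taxon C\n"
SOURCE_B = "host\tparasite\nTaxon B\tTaxon D\n"


def normalize_a(text):
    """Source A lists predator/prey pairs as comma-separated values."""
    for row in csv.DictReader(io.StringIO(text)):
        yield (row["predator"], "preysOn", row["prey"])


def normalize_b(text):
    """Source B lists host/parasite pairs as tab-separated values."""
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        yield (row["parasite"], "parasiteOf", row["host"])


def aggregate(*streams):
    """Merge normalized streams into one de-duplicated, sorted list."""
    return sorted({record for stream in streams for record in stream})


interactions = aggregate(normalize_a(SOURCE_A), normalize_b(SOURCE_B))
for source, interaction, target in interactions:
    print(source, interaction, target)
```

Each dataset gets its own small normalizer, and everything downstream (exports, web services, queries) only has to understand the one common record shape.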
Poelen, Mungall, Simons, Reiz
http://globalbioticinteractions.wordpress.com/
14 datasets containing 25k taxa, 422k interactions, for 3k locations
alpha version of ingestion, normalization, aggregation
alpha version of web API
alpha version of data exports
Information Visualization MOOC, led by Dr. Katy Börner
Easy access to analyzable trait data
“Are blue organisms more common in high altitudes?”
“How can I predict vulnerability to climate change based on life history characteristics?”
“What organisms should I collect to fill in gaps in genome quality tissue collections?”
• Look for data type, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O'Meara) http://barbbanbury.info/barbbanbury/Reol.html
• Find more specialized data repositories
Adding traits to EOL
Funded: Marine focus
<scientific name> <hasAvgBodyMass in g> <value>
<scientific name> <preysOn> <scientific name>
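Statements of this subject-predicate-object shape are easy to combine by machine. The sketch below is a minimal illustration, assuming plain Python tuples rather than any particular triple store; the taxon names and mass values are placeholders, and `hasAvgBodyMass_g` is a made-up spelling of the body-mass predicate.

```python
# Statements as (subject, predicate, object) tuples.
# Names and values are placeholders, not real measurements.
STATEMENTS = [
    ("Taxon A", "hasAvgBodyMass_g", 1200.0),
    ("Taxon B", "hasAvgBodyMass_g", 35.0),
    ("Taxon A", "preysOn", "Taxon B"),
]


def objects_of(statements, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [o for s, p, o in statements if s == subject and p == predicate]


def prey_with_mass(statements, predator):
    """Pair each prey of `predator` with its recorded average body mass."""
    result = {}
    for prey in objects_of(statements, predator, "preysOn"):
        masses = objects_of(statements, prey, "hasAvgBodyMass_g")
        result[prey] = masses[0] if masses else None
    return result


print(prey_with_mass(STATEMENTS, "Taxon A"))
```

Because interaction statements and measurement statements share one shape, a question such as "how heavy is the prey of this predator?" is answered by chaining two lookups.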
Harvest and display on data tab
Add high-level semantics from coarse SPM ontology
Downloads, fancy searching
Machine access
INSDC 900,000 species 4,000 genomes
60 million DNA sequence records
How are these related to traits?
Next step: TraitBank
Thanks
Funding & other contributions
Sloan Foundation
Smithsonian Institution
David Rubenstein
Marine Biological Laboratory
Harvard University
Our content partners
Thousands of individual contributors, and hundreds of volunteer curators
Image credits
Jenny from Taipei
University of Birmingham
Cynthia Parr, Chief Scientist @eol
@cydparr
GloBI: Jorrit Poelen (lead/software), Chris Mungall (ontologies), James Simons (biologist) and Robert Reiz (software). Datasets shared by: Peter D. Roopnarine, Rachel Hertog, Carlos García-Robledo, James Simons, Jenny L. Wrast, C. Barnes, International Council for the Exploration of the Sea (ICES), Jose R. Ferrer Paris, Senol Akin, Malcolm Storey (BioInfo.org.uk), Ivy E. Baremore, Joel Sachs (SPIRE), Colt W. Cook, David A. Blewett
Alexandria Archive: Sarah Kansa, Eric Kansa, 34 other zooarchaeologists
BioNames: Rod Page, Ryan Schenk
MOOC: Katy Börner, Twy Bethard, Andrew Miles, Mattia Della Libera