The Future of Microalgal Taxonomy Anne Thessen, [email protected] David Patterson [email protected] (Data Conservancy, Life Sciences)

DESCRIPTION

This talk describes the potential of semantic web technology to make the practice of taxonomy easier. It was presented at the 2011 Phycological Society of America conference in Seattle, WA, USA.

Page 1: The Future of Microalgal Taxonomy

The Future of Microalgal Taxonomy

Anne Thessen, [email protected]

David Patterson, [email protected]

(Data Conservancy, Life Sciences)

Page 2: The Future of Microalgal Taxonomy

Scientist’s Dream

Computer, what is the trajectory of the planet Ceti Alpha 5?

Page 3: The Future of Microalgal Taxonomy

Taxonomist’s Dream

How many algal species can be found on this planet?

Page 4: The Future of Microalgal Taxonomy

Taxonomist’s Dream

What species is this?

Page 5: The Future of Microalgal Taxonomy

Taxonomist’s Dream

Page 6: The Future of Microalgal Taxonomy

Taxonomist’s Dream

Page 7: The Future of Microalgal Taxonomy

Setting the stage for a ‘big new biology’

• BIG = data-centric (like particle physics and astronomy)

• Characterized by data sharing via a virtual pool

• New = new skill sets, tools, cyber-infrastructure to exploit the data pool

• Data driven discovery as a new means of understanding

• GenBank as a model within the Life Sciences

Page 8: The Future of Microalgal Taxonomy

Small science

Large number of providers with small amounts of data.

Small number of providers with lots of data.

Page 9: The Future of Microalgal Taxonomy

Aa paleacea

Limulus polyphemus

Kiwa hirsuta

Osedax frankpressi

Kingia australis

Names

Pieris japonica

Pieris rapae

Trypanosoma brucei

Homo sapiens

Page 10: The Future of Microalgal Taxonomy

Many names for one taxon

Didimosphenia geminata

Didymosphenia geminata

Didymosphenia geminata

Didymosphenia geminata

Rock snot

Didymo

Echinella geminata

Gomphonema geminatum

Gomphonema vulgare

Page 11: The Future of Microalgal Taxonomy

Reconciliation Group

Didymosphenia geminata

Didimosphenia geminata

Didymo

Rock Snot

Echinella geminata

Gomphonema geminatum

Gomphonema vulgare

Page 12: The Future of Microalgal Taxonomy

Reconciliation Group

Didymosphenia geminata

Didimosphenia geminata

Didymo

Rock Snot

Echinella geminata

Gomphonema geminatum

Gomphonema vulgare
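
A reconciliation group can be pictured as a lookup table in which every variant (misspelling, vernacular name, older scientific name) points to a single name. The sketch below is only an illustration of that idea, not the Global Names implementation. Treating "Didymosphenia geminata" as the accepted name, and the helper names `normalize`, `build_index`, and `reconcile`, are assumptions made for the example.

```python
# Hypothetical reconciliation group built from the names on the slide.
# Which name counts as "accepted" is an assumption for illustration.
RECONCILIATION_GROUP = {
    "accepted": "Didymosphenia geminata",
    "variants": [
        "Didimosphenia geminata",
        "Didymo",
        "Rock Snot",
        "Echinella geminata",
        "Gomphonema geminatum",
        "Gomphonema vulgare",
    ],
}


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def build_index(groups):
    """Map every normalized name in every group to that group's accepted name."""
    index = {}
    for group in groups:
        accepted = group["accepted"]
        index[normalize(accepted)] = accepted
        for variant in group["variants"]:
            index[normalize(variant)] = accepted
    return index


def reconcile(name, index):
    """Return the accepted name for a variant, or None if the name is unknown."""
    return index.get(normalize(name))


INDEX = build_index([RECONCILIATION_GROUP])
print(reconcile("rock snot", INDEX))
```

Data sets that label the same organism with different names can then be joined on the reconciled name rather than on the raw string.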

Page 13: The Future of Microalgal Taxonomy

One name for many taxa

Name: Cyclophora

Disambiguate by authority, species, contextual data

Cyclophora Castracane 1878

Species: Cyclophora tenuis

Contextual data: Diatom, Chloroplast, Frustule, Benthic, Marine

Cyclophora Hübner 1822

Species: Cyclophora porata

Contextual data: Food, Moth, Wings, Exoskeleton, Caterpillar
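
Disambiguation by contextual data can be sketched as scoring each candidate by how many of its context terms appear in the surrounding text. This is a minimal illustration under stated assumptions: the `TaxonConcept` class, the `disambiguate` function, and the simple word-overlap score are all hypothetical, and the context terms are the ones listed on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    """One meaning of a name, identified by authority plus context terms."""
    name: str
    authority: str
    example_species: str
    context: frozenset


# The two genera named Cyclophora, with the slide's contextual terms.
CANDIDATES = [
    TaxonConcept("Cyclophora", "Castracane 1878", "Cyclophora tenuis",
                 frozenset({"diatom", "chloroplast", "frustule",
                            "benthic", "marine"})),
    TaxonConcept("Cyclophora", "Hübner 1822", "Cyclophora porata",
                 frozenset({"food", "moth", "wings", "exoskeleton",
                            "caterpillar"})),
]


def disambiguate(name, text, candidates=CANDIDATES):
    """Pick the candidate whose context terms overlap most with the text.

    Returns None when no candidate matches or when the best score is tied.
    """
    words = {w.strip(".,;:()").lower() for w in text.split()}
    scored = [(len(c.context & words), c) for c in candidates if c.name == name]
    best = max((score for score, _ in scored), default=0)
    if best == 0:
        return None
    winners = [c for score, c in scored if score == best]
    return winners[0] if len(winners) == 1 else None


hit = disambiguate("Cyclophora", "A benthic marine diatom with a silica frustule.")
print(hit.authority if hit else "ambiguous")
```

When the authority is present in the source it is the stronger signal; the word-overlap score is only a fallback for text that gives the bare name.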

Page 14: The Future of Microalgal Taxonomy

Global Names Architecture

Provider Services

DATA AND SERVICE CONSUMERS

DATA AND SERVICE PROVIDERS

EXPERTS

Consumer Services

GNA

Page 15: The Future of Microalgal Taxonomy

Names-based cyberinfrastructure

• Managing names to manage biodiversity data

- All names (scientific, vernacular, surrogate)

- For all organisms

- Many names for one species reconciled

- One name for many species disambiguated

• Global Names Architecture - a virtual layer, using names services to link together distributed data

• Globalnames.org

• Micro*scope (microscope.mbl.edu) and Encyclopedia of Life (eol.org)

Page 16: The Future of Microalgal Taxonomy

Legacy Data

• Narrative tradition in biology

• Too much for a human

• Can we get a machine to do the work?

• NLP!!!

Page 17: The Future of Microalgal Taxonomy

Legacy Data

• Use NLP/machine learning to extract names and characters

• Hong Cui
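
As a rough idea of what "extracting names" means in practice, the sketch below finds candidate binomials by combining a regular expression with a lookup list of known genera. This is deliberately not NLP or machine learning, and it is not the method used in the work cited on the slide; it is a much simpler dictionary approach swapped in for illustration. The `KNOWN_GENERA` list and the `find_names` function are hypothetical.

```python
import re

# Hypothetical lookup list; a real system would use a large names index.
KNOWN_GENERA = {"Spirogyra", "Didymosphenia"}

# A capitalized word followed by a lowercase word, e.g. "Spirogyra varians".
BINOMIAL = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{2,})\b")


def find_names(text, genera=KNOWN_GENERA):
    """Return candidate binomials whose first word is a known genus."""
    return [f"{genus} {epithet}"
            for genus, epithet in BINOMIAL.findall(text)
            if genus in genera]


sample = ("Filaments of Spirogyra varians were found near "
          "Didymosphenia geminata mats.")
print(find_names(sample))
```

A pattern like this produces false positives such as "Spirogyra is" in ordinary prose, which is one reason statistical and learned approaches are attractive for legacy literature.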

Page 18: The Future of Microalgal Taxonomy

Legacy Data

• Spirogyra:chloroplasts:present

Page 19: The Future of Microalgal Taxonomy

Legacy Data

• Spirogyra:chloroplasts:present:attribution
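
The colon-separated statements on these two slides map naturally onto a small record type: taxon, character, value, and an optional attribution. The parser below is a minimal sketch of that mapping; the `Statement` type and `parse_statement` function are hypothetical names, and the colon-delimited syntax is taken from the slides rather than from any published format.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    """One extracted fact: a taxon has a character with some value."""
    taxon: str
    character: str
    value: str
    attribution: Optional[str] = None


def parse_statement(line):
    """Parse 'taxon:character:value[:attribution]' into a Statement.

    Splitting at most three times lets the attribution itself contain colons.
    """
    parts = [p.strip() for p in line.lstrip("• ").split(":", 3)]
    if len(parts) < 3 or not all(parts[:3]):
        raise ValueError(f"expected taxon:character:value, got {line!r}")
    return Statement(*parts)


print(parse_statement("Spirogyra:chloroplasts:present"))
print(parse_statement("Spirogyra:chloroplasts:present:attribution"))
```

Keeping the attribution with each statement means a fact extracted from the literature can always be traced back to where it came from.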

Page 20: The Future of Microalgal Taxonomy

Coffee Ontology

coffee

is a

drink
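
The "is a" relation is what lets a machine draw conclusions that were never stated directly. The sketch below stores the slide's single statement plus one extra, clearly hypothetical, statement ("espresso is a coffee") and follows the chain upward. The names `PARENT`, `ancestors`, and `is_a` are assumptions for the example.

```python
# Child -> parent links. "coffee is a drink" is from the slide;
# "espresso is a coffee" is a hypothetical addition for illustration.
PARENT = {
    "coffee": "drink",
    "espresso": "coffee",
}


def ancestors(term, parent=PARENT):
    """Follow 'is a' links upward and return every class the term belongs to."""
    found = []
    while term in parent and parent[term] not in found:
        term = parent[term]
        found.append(term)
    return found


def is_a(term, cls, parent=PARENT):
    """True if term is, directly or by inheritance, a kind of cls."""
    return cls in ancestors(term, parent)


print(ancestors("espresso"))
print(is_a("espresso", "drink"))
```

The second call succeeds even though no statement says "espresso is a drink"; the conclusion follows from chaining the two links.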

Page 21: The Future of Microalgal Taxonomy

Existing Ontology

Page 22: The Future of Microalgal Taxonomy

Semantic Web

Page 23: The Future of Microalgal Taxonomy

Data Discovery and Aggregation

Page 24: The Future of Microalgal Taxonomy

Future Data

Triple Store
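
A triple store holds statements of the form subject, predicate, object and answers queries by pattern matching. The in-memory class below is only a toy sketch of that idea, assuming nothing about any particular product; `TripleStore`, `add`, and `match` are hypothetical names, and the predicate `has_chloroplasts` is invented for the example.

```python
class TripleStore:
    """A toy in-memory store of (subject, predicate, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Record one statement; duplicates are ignored."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("coffee", "is_a", "drink")
store.add("Spirogyra", "has_chloroplasts", "present")
print(store.match(predicate="is_a"))
print(store.match(subject="Spirogyra"))
```

Production triple stores add persistent storage, shared identifiers, and a query language, but the pattern-matching core is the same.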

Page 25: The Future of Microalgal Taxonomy

The New Workforce

• Informatics/computing training

• Modified workflows

• Importance of data management and preservation

Page 26: The Future of Microalgal Taxonomy

In Summary

• Big New Biology is coming, and taxonomy can benefit from being a part of it

• Existing data can be made machine-readable using information extraction algorithms

• Existing workflows can be modified to capture data close to the source

• Data can be shared using the semantic web

Page 27: The Future of Microalgal Taxonomy

Acknowledgments

• Dima Mozzherin

• David Shorthouse

• Sayeed Choudhury

• Pete DeVries