Lecture presented at the Journals Club of the Naturhistorisches Museum Bern, March 17, 2014. "Towards an (European) Open Biodiversity Knowledge Management System"
Towards an (European) Open Biodiversity Knowledge Management System
Donat Agosti (Plazi, Bern)
March 17, 2014
Berne, Journal Club @ NMBE
El Bulli: Cooking in Progress (2011) Ferran Adrià (Actor), Gereon Wetzel (Director)
The cook (Ferran Adrià) wants to know when he can expect what seafood for his kitchen.
He assumes that phenological data is open and accessible to anyone.
He has a question and needs to know: What seafood at what time?
His goal is to provide a service based on the use of observation data, i.e. treat you (and make some money).
The fishmonger knows when what seafood is available.
He considers his knowledge of seafood phenology as his asset to make money.
His goal is to make money with knowledge based on observation records and understanding the characteristics of seafood.
What do YOU want to know?
How do YOU expect to get to your information?
• What are the main online resources you use?
• Do you maintain your own digital library?
• Do you participate in an online project, e.g. Scratchpads, a catalogue or a digital archive, and make your data accessible?
• … ?
What does this mean?
Meredith Lane, e-biosphere Conference, London 2009
Hardisty, Nature 502, 171 (2013)
BUT: predictive ecology has substantial data needs
Harfoot, BIH2013, Rome, 2013
The big question
What is the future of the biological world?
Imagine if we could:
…Predict community level dynamics of ecosystems at scales from local to global, based on the ecology and biology of all individual organisms
Decentralized biodiversity infrastructure
Plants
3,400 Herbaria worldwide
10,000 Associate curators and specialists
350,000,000 specimens in collections
180,000,000 specimens digitized
2,000,000,000 specimens including animals
Source: gbif.org; http://sciweb.nybg.org/science2/IndexHerbariorum.asp
Biodiversity libraries
200,000,000+ printed pages
1,900,000 species described
20,000,000+ species treatments
17,000 new species per year
BUT: The data are hidden
Incomplete digitization
Publications are unstructured
Collections are incomplete
Data is not linked
Most data are not open
Nationaal Herbarium Nederland collection on GBIF
Source: http://www.gbif.org/dataset/7b33b040-f762-11e1-a439-00145eb45e9a
One collection’s view of the world
Another collection’s view of the world
http://www.gbif.org/dataset/82b0f51c-f762-11e1-a439-00145eb45e9a
What does this mean?
The Linking Open Data cloud diagram
Linked Open Data Cloud
Names as information tags in life sciences
Names
Characteristics
Publications
Genes
Collections
Specimens
Distribution
The enhanced and linked treatments, extracted, stored on Plazi.org, and served in
a human readable form, are linked to the underlying data: Fisher & Smith, 2008,
PLoS ONE.
Towards an (European) Open Biodiversity Knowledge Management System
Coordination and Policy Development in Preparation for a European Open Biodiversity Knowledge Management System
Supported by the European Commission through its FP7 research funding programme
pro-iBiosphere
pro-iBiosphere: Partners
Create digital objects
+ Identifiers and resolvers
+ Open Access
+ Adequate infrastructure
+ Sustainable and permanent infrastructure
+ Reliable services for partners in research projects and society
Seamless Global Virtual Research Knowledge Management System
(European Open Biodiversity Knowledge Management System)
Biodiversity Knowledge Management System
Impact
The envisaged European Open Biodiversity Management System will:
• Support reliable and permanent open access to digital biodiversity records
• Create identifiers and link biodiversity literature, collections, digital objects, genes, etc.
• Ensure global interoperability and sharing of biodiversity data, information and knowledge
• Provide new services in support of open science
• Provide the ground for modelling the biosphere
• Develop data policies to harness the potential of open access
Convert data into machine readable data
Literature as an example
Text
<tax:treatment>
<tax:nomenclature>
<tax:name>
<tax:xid source="HNS" identifier="193329"/>
<tax:xmldata>
<dc:Genus>Mystrium</dc:Genus>
<dc:Species>leonie</dc:Species>
</tax:xmldata>
Mystrium leonie
</tax:name> Bihn & Verhaagh
<tax:status>n. sp.</tax:status>
Fig 1 D - F
</tax:nomenclature>
<tax:div type="description">
<tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL
1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving
to a sharp apical tooth, the apex parallel to the anterior clypeal margin.
(Holotype with material in mandibles, so mandibles and anterior clypeus
described below from paratypes.) Median clypeus
....
</tax:p>
</tax:div>
</tax:treatment>
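Once a treatment is marked up like this, a program can read the name directly instead of parsing free text. The following is a minimal Python sketch, assuming a well-formed fragment. The two namespace URIs are placeholders invented for this example (the slide shows only the `tax:` and `dc:` prefixes, not their declarations), and `read_name` is a hypothetical helper, not part of any Plazi tool.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions for this sketch): the slide
# shows only the prefixes "tax" and "dc", not their declarations.
NS = {
    "tax": "http://example.org/taxonx",
    "dc": "http://example.org/dc",
}

# Shortened, well-formed version of the fragment shown on the slide.
FRAGMENT = """
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""


def read_name(xml_text):
    """Return genus, species, status and external identifier of a treatment."""
    root = ET.fromstring(xml_text)
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        # (source, identifier) pair of the external name record
        "xid": (xid.get("source"), xid.get("identifier")),
    }


print(read_name(FRAGMENT))
```

The point of the sketch is only that genus, species and status become addressable fields once the text is tagged; a real TaxonX document would carry its own namespace declarations.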
Enhanced and linked text
Treatment
A publication or section of a publication documenting the
features or distribution of a related group of organisms
(called a “taxon”, plural “taxa”) in ways adhering to highly
formalized conventions.
http://terms.tdwg.org/wiki/tp:taxon-treatment
Catapano, 2010.
Treatment
[Diagram: treatments of X-us c-us, X-us b-us and X-us b-us n. sp., each made up of a citation, a description and materials citations]
Treatments
[Diagram: the same treatments shown inside the articles that contain them, each article with a title and bibliographic references, back to Systema naturae; labels: Treatments, References]
Treatments can be cited, like publications, with stable identifiers.
http://treatment.plazi.org/id/31F96F41E3E002BD88985A4F3A20E45A
Best practices for stable URIs:
http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs
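A stable identifier of this kind can be checked and assembled mechanically. The following is a minimal Python sketch, assuming (from the single example above) that a treatment identifier is 32 hexadecimal characters appended to `http://treatment.plazi.org/id/`. The function names are hypothetical, and nothing is requested from the server.

```python
import re

# Base taken from the example URI above; treating it as a fixed prefix
# is an assumption of this sketch.
BASE = "http://treatment.plazi.org/id/"

# Assumption: identifiers are 32 hexadecimal characters, as in the example.
ID_PATTERN = re.compile(r"[0-9A-Fa-f]{32}")


def treatment_uri(identifier):
    """Build a citable treatment URI from a bare identifier."""
    if not ID_PATTERN.fullmatch(identifier):
        raise ValueError(f"not a 32-character hex identifier: {identifier!r}")
    return BASE + identifier.upper()


def treatment_id(uri):
    """Extract the identifier from a treatment URI, validating its shape."""
    if not uri.startswith(BASE):
        raise ValueError(f"not a treatment URI: {uri!r}")
    identifier = uri[len(BASE):]
    if not ID_PATTERN.fullmatch(identifier):
        raise ValueError(f"malformed identifier in {uri!r}")
    return identifier.upper()


uri = treatment_uri("31F96F41E3E002BD88985A4F3A20E45A")
print(uri)
print(treatment_id(uri))
```

Keeping construction and validation in one place means a citation list can be checked for malformed treatment links before it is published.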
Jeremy Miller, Work in Progress
Names can be linked automatically
Automated registration
MANUSCRIPT SUBMISSION
MANUSCRIPT ACCEPTED
XML Response
ARTICLE PUBLISHED
Taxon name available/valid (effectively published)
XML article metadata
XML Query
Peer review
Penestomus egazini Miller, Haddad & Griswold, 2010
Progress
Treatments (% complete): 4/4 (100%)
Data summary
Specimen records: 41
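[Pie chart of specimen records by category (adult female, adult male, other); values shown: 51%, 2%, 46%]
Specimen collections
Institutions: 3
Muséum National d'Histoire Naturelle, Paris
California Academy of Sciences, San Francisco
Albany Museum, Grahamstown
Distribution
Countries: Lesotho, South Africa
[Pie charts of specimen records by institution and by country; values shown: 2%, 5%, 76%, 20%]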
Georeferenced materials citations
Export species materials citations (DwC)
Export treatment materials citations (DwC)
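An export of this kind can be pictured as a table whose column names are Darwin Core terms. The following is a minimal Python sketch, assuming a plain CSV layout. The single record combines names that appear on the slide, but the pairing of institution, country and sex is made up for illustration and is not an actual specimen record; `to_dwc_csv` is a hypothetical helper.

```python
import csv
import io

# Darwin Core term names used as column headers.
DWC_FIELDS = ["scientificName", "institutionCode", "country", "sex"]

# Illustrative record only (assumption): the values are taken from the
# slide, but their combination does not describe a real specimen.
EXAMPLE = [
    {
        "scientificName": "Penestomus egazini Miller, Haddad & Griswold, 2010",
        "institutionCode": "Albany Museum, Grahamstown",
        "country": "South Africa",
        "sex": "female",
    }
]


def to_dwc_csv(records):
    """Serialize materials citations as CSV with Darwin Core column names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_dwc_csv(EXAMPLE))
```

Because the column names come from a shared vocabulary, the same file can be read by an aggregator without a mapping step written for each source.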
American Museum of Natural History
Data summary
Materials citations 2004-2013: 111,364
Distribution
Georeferenced materials citations
Export species materials citations (DwC)
[Bar chart: Materials Citations Records by Researcher, axis 0 to 20,000. Bars: Other, Donat Agosti, David Grimaldi, Toby Schuh, James Carpenter, Norman Platnick]
Zootaxa
Data summary
Materials citations 2004-2013: 11,476
Distribution
Georeferenced materials citations
Export species materials citations (DwC)
[Bar chart: Materials Citations Records by Institution, axis 0 to 2,500. Bars: Other; Muséum National d'Histoire Naturelle, Paris; Natural History Museum, London; Museum of Comparative Zoology; Smithsonian Institution; American Museum of Natural History]
Better:
Create data as machine readable data
PROSPECTIVE PUBLISHING | HISTORICAL LITERATURE
Legacy and new taxonomic literature
TaxPub XML schema, Pensoft mark-up tool | TaxonX schema, Plazi's GoldenGATE editor
Automated submission; peer review
Marked up publications: PDF, HTML and XML
Unified marked up final output: taxon treatments, keys, images, localities
Content management systems & repositories (e.g., Plazi, EOL, GBIF, Scratchpads, EDIT)
Archiving: electronic archives; data centers
Wikis: Species-ID, Wikispecies, Wikipedia
Indexing (IPNI, ZooBank, MycoBank, GNA)
Aggregators (EOL, GBIF)
End users
http://biodiversitydatajournal.com/articles.php?id=995
Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
Open Access
Knowledge wants to be free
Before antbase.org, Harvard's Museum of
Comparative Zoology could claim to be the only
location with a complete set of ant systematics
publications from 1758 to the present.
Through antbase.org's digital library, access
to this body of literature is worldwide,
and it is actively used
(>10,000 visits in a single month).
Knowledge has to be free
Bouchout Declaration, 2014
Implementation by the Swiss National Science Foundation, 2007
Berlin Declaration, 2003
Bouchout Declaration, 2014 (1)
• The free and open use of content, services and other digital resources about biodiversity;
• Licenses that grant all users a free, irrevocable, world-wide right to copy, use, distribute, transmit and display the work publicly as well as to build on the work and to make derivative works, subject to proper attribution consistent with community practices;
• Policy developments that will foster free and open access to biodiversity data;
• Tracking the use of information to ensure that sources and suppliers of data are assigned credit for their contributions;
• An agreed infrastructure, standards and protocols to improve access to and use of open data;
Bouchout Declaration, 2014 (2)
• Registers for content and services to allow discovery, access and use of open data;
• Persistent, dereferenceable identifiers for data objects and physical objects such as specimens, images and taxonomic treatments;
• Linking data using agreed vocabularies, both within and beyond biodiversity, that enable participation in the Linked Open Data Cloud;
• Dialogue coordinated by the leading signatories to refine the concept, priorities and technical requirements of Open Biodiversity Knowledge Management;
• A sustainable Open Biodiversity Knowledge Management that is attentive to scientific, sociological, legal, and financial aspects.
Knowledge has to be made free
You!
Reduce costs – future publishing
Don't waste money:
Focus on Open Access enhanced linked publications, not PDF only
Plazi
Founded in 2008
Swiss-based NGO with members in Switzerland, Germany, Bulgaria, US and Iran
Research-based think tank with the mission to promote open access to scientific content
Five pillars: legal advice, technical innovations and solutions, maintenance of a treatment repository and Biowikifarm, consultancy, advocacy
Modify copyright legislation to better serve scientific needs
TaxPub | TaxonX
DTD | Schema
Prospective publications | Legacy publications
Constrained | Loose
Derivative of JATS | Independent
Self-contained | Allows import of other schemas
Plazi workflow: overview
Plazi Search and Retrieval Server: Access to data
Darwin Core Archive
Biowikifarm
You (human or machine)
Plazi GmbH founded in 2012 as
service SME owned by Plazi
Funding from public donors, e.g. EU, corporate and private
Funding:
EU
EU-BON
Pro-iBiosphere
Private sector
In-kind
Voluntary work
Clients are global
Consultancies and Services:
Consulting publishers on how to produce semantically enhanced XML output (e.g. EJT, Zootaxa, Smithsonian Institution)
Service to mark up literature
http://plazi.org
Thank you very much!
Donat Agosti
This project is funded under the European Union's Seventh Framework
Programme (FP7/2007-2013) under grant agreement №312848.