Fourth Annual Summit | Feb 21-23 2014 | Tucson, AZ
Scratchpads for community involvement for natural history collections
Dr Dimitris Koureas
Biodiversity Informatics Group | Department of Life Sciences
Natural History Museum London
Slide 2
What are we trying to tackle?
The long tail of biodiversity data (20% | 80%)
Typically produced by small communities
Inaccessible | native formats / private silos
Disconnected | not aggregated or discoverable
Redundant | overlapping efforts, no coordination
Cluttered | small and dispersed datasets
Slide 3
Virtual Research Environments
Online collaborative environments, efficient in incentivising and enabling researchers to mobilise their data
Our goal is to make every researcher digital
Can VREs help?
Underlying technologies that help semantically enrich and aggregate data at a higher level
Provide efficient tools, simple interfaces and comprehensive documentation
Incentivise researchers to enter, share and finally mobilise their datasets
Slide 4
Enter | Structure | Curate | Link | Publish
Biodiversity data online
7 years of continuing development | 3 major grants | Industry-leading platform
Slide 5
65,000 unique visitors/month
Unique visitors per month to Scratchpads sites
660 Scratchpads communities by 7,100 active registered users, covering 90,000 taxa in 615,000 pages
In total, more than 1,600,000 visitors
A Scratchpad is a gateway to big data
In-house data
External data & services
Biodiversity standards (TDWG, DwC, Audubon)
Slide 8
Source data: Unstructured | Overlapping | Disconnected | Native formats & vocabularies
Mark-up / Data annotation | Atomisation
Collaborate | Curate | Link | Aggregate | Publish
Controlled vocabularies are key for efficient data capture
Slide 9
In order to capture and annotate data we need:
1. Fine-grained pre-defined fields
2. Comprehensive controlled vocabularies
Slide 10
1. Capturing specimen record data
Taxonomic identification
Date
Collector
Location
- Continent
- Country
- State/Province
- GPS
- Locality
- Area/place
- Habitat
- Substrate (environmental material)
- Biome
Usually transcribed from label
Some inferred by curators
Provided in a highly inconsistent way
What we currently use: EOL | IUCN | ISO web service | DwC
2. Generating character/trait projects
Morphological/anatomical characters
Ecological traits
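The specimen fields listed on this slide can be sketched as a small record type. This is only an illustration, not how Scratchpads stores data: the attribute names follow Darwin Core (DwC) terms, while the COUNTRY_CODES table, the SpecimenRecord class and normalise_country are hypothetical names introduced here. The table stands in for an ISO country lookup service, and the sketch shows how an inconsistently transcribed country value might be mapped to a controlled code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from label spellings to ISO 3166-1 alpha-2 codes.
# A real deployment would query a maintained ISO country service instead.
COUNTRY_CODES = {
    "greece": "GR",
    "hellas": "GR",
    "united kingdom": "GB",
    "uk": "GB",
    "great britain": "GB",
}


@dataclass
class SpecimenRecord:
    # Attribute names follow Darwin Core terms.
    scientificName: str
    eventDate: str
    recordedBy: str
    country: str                       # as transcribed from the label
    countryCode: Optional[str] = None  # controlled value, filled in later
    locality: Optional[str] = None
    habitat: Optional[str] = None


def normalise_country(record: SpecimenRecord) -> SpecimenRecord:
    """Set countryCode when the transcribed country is recognised, else None."""
    key = record.country.strip().lower()
    record.countryCode = COUNTRY_CODES.get(key)
    return record


if __name__ == "__main__":
    rec = SpecimenRecord("Quercus ilex L.", "1987-05-12", "A. Collector", " Hellas ")
    print(normalise_country(rec).countryCode)
```

Values that are not recognised leave countryCode empty rather than guessing, so a curator can review them.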
Slide 11
Users of Virtual Research Environments: consumers of, as well as contributors to, ontologies
Biodiversity communities
Communities working on ontologies
Bottom-up approach
Top-down approach
Slide 12
Community involvement
Ontology granularity
Top-down approach
Bottom-up approach
High-level approach
Deep hierarchy
End-user community involvement
Slide 13
1. Simple and intuitive end-user products
2. Controlled vocabularies over highly structured ontologies for data capture
3. Mechanisms for updating vocabularies based on user custom entries
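One way to picture the third point on this slide is a vocabulary that records custom entries and adds a term once it has been entered often enough. This is a minimal sketch under stated assumptions, not the Scratchpads mechanism: the GrowingVocabulary class, its promote_after threshold and the example terms are all hypothetical, and a real workflow would more likely route candidate terms through curator review than promote them automatically.

```python
from collections import Counter


class GrowingVocabulary:
    """Controlled vocabulary that tracks custom entries and promotes frequent ones."""

    def __init__(self, terms, promote_after=3):
        self.terms = {t.strip().lower() for t in terms}
        self.promote_after = promote_after  # hypothetical promotion threshold
        self.custom_counts = Counter()      # custom entries awaiting promotion

    def capture(self, value):
        """Return (term, is_controlled); count the entry if it is a custom one."""
        term = value.strip().lower()
        if term in self.terms:
            return term, True
        self.custom_counts[term] += 1
        if self.custom_counts[term] >= self.promote_after:
            # Seen often enough: the term is controlled from the next capture on.
            self.terms.add(term)
            del self.custom_counts[term]
        return term, False


if __name__ == "__main__":
    habitats = GrowingVocabulary(["forest", "grassland"], promote_after=2)
    for entry in ["Forest", "garrigue", " Garrigue ", "garrigue"]:
        print(habitats.capture(entry))
```

The entry that reaches the threshold is still reported as custom; only later captures of the same term are treated as controlled.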
Slide 14
Ontologies as Infrastructure
Communal / agreed | Persistent | Essential | Robust & reliable
Ontologies as Research
Concerns specific communities | Experimentation | Frequent changes
Before we can widely implement the use of ontologies we need to shift from one to the other
Good for knowledge representation and reasoning
Good for data capturing
Slide 15
Vertical approach
Horizontal approach
The e-infrastructures pyramid
Slide 16
Thank you
Comments/questions?
@DimitrisKoureas [email protected]