Reactions to the Open Spectral Database

Reactions to The Open Spectral Database

http://osdb.info

Stuart J. Chalk, Department of Chemistry, University of North Florida

schalk@unf.edu

Instigator: Tony Williams

SCTY 28 – Pacifichem 2015

What would Jean-Claude Bradley have wanted?

Share and Reuse Research Data!

How Do You Make Everything Open?

JCAMP Implementation

The Open Spectral Database

Data Model

Live Demo (fingers crossed)

Future Plans

Conclusion

Outline

What Would JCB Have Wanted?

Simple: Openness as the norm not the exception

Data made available, without restriction, so it's useful

Mechanisms/tools to make data available

Formats to allow others to get the data…

…but also so it's easy to use

Annotated data to make it easy to find

Community-driven promotion of and action on these issues

Ryan P. Womack (2015) Research Data in Core Journals in Biology, Chemistry, Mathematics, and Physics. PLoS ONE 10(12): e0143460. doi:10.1371/journal.pone.0143460

Share and Reuse Research Data!

You have to know/define what “everything” means

Open Data

Open Data Model

Open and useable data structures

Open Code

Open to input from the community on all aspects

Open to add, extend, change, and rethink all of this

How Do You Make Everything Open?

Spectral data – There are many formats but only one open and generally accepted standard – JCAMP

It's not perfect…

…but it's an output format people can share

Let's export the data and metadata, and infer as much as possible, from JCAMP files

Not as easy as it seems…

First Attempt

Great data exchange format, however…

…not meant to be computer input…

…more a way to get data out so a human can process

Missing parameters (metadata)

Missing data

Incorrect values

Extra data

Incorrectly compressed

Challenges with JCAMP
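The kinds of problems listed above can be caught with a few simple checks once a file has been read. What follows is a minimal sketch in Python, not the OSDB code (which is written in PHP). The set of labels treated as required, the function name `check_spectrum`, and the tolerance are assumptions made for illustration. Labels are assumed to be already normalized (upper case, no spaces or dashes).

```python
import math

# Assumption: a small subset of core labels treated as required for this sketch.
REQUIRED_LABELS = ("TITLE", "JCAMPDX", "DATATYPE", "XUNITS", "YUNITS", "NPOINTS")


def _as_float(value):
    """Return value as a float, or None if it is absent or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def check_spectrum(ldrs, y_values):
    """Return a list of human-readable problems found in one parsed spectrum.

    ldrs     -- dict of normalized label -> string value
    y_values -- list of stored (unscaled) Y ordinates read from the data table
    """
    problems = []

    # Missing parameters (metadata)
    for label in REQUIRED_LABELS:
        if not ldrs.get(label):
            problems.append(f"missing parameter: {label}")

    # Missing or extra data: compare the declared point count with what was read
    declared = _as_float(ldrs.get("NPOINTS"))
    if declared is not None:
        if len(y_values) < declared:
            problems.append(
                f"missing data: NPOINTS declares {int(declared)}, found {len(y_values)}"
            )
        elif len(y_values) > declared:
            problems.append(
                f"extra data: NPOINTS declares {int(declared)}, found {len(y_values)}"
            )

    # Incorrect values: FIRSTY should match YFACTOR times the first stored Y
    first_y = _as_float(ldrs.get("FIRSTY"))
    y_factor = _as_float(ldrs.get("YFACTOR", "1"))
    if first_y is not None and y_factor is not None and y_values:
        if not math.isclose(first_y, y_factor * y_values[0], rel_tol=1e-6, abs_tol=1e-9):
            problems.append("incorrect value: FIRSTY does not match the first data point")

    return problems
```

An empty list means none of these particular checks failed; it does not mean the file is fully valid.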

Upload JCAMP spectra

Data and metadata extracted

Organize metadata so it can be used to find data

Use a REST-based website and API to make data available and allow searching – document the API

Make the website available as a project on GitHub and invite the community to get involved

The Open Spectral Database

Apache 2.4 (http://httpd.apache.org)

PHP 5.6 (http://www.php.org)

CakePHP 2.7 PHP Framework (http://cakephp.org)

MySQL 5.5 (http://www.mysql.com)

jQuery (JavaScript) (http://jquery.com)

Flot for jQuery (http://www.flotcharts.org)

JSmol (http://jmol.org)

Bootstrap CSS (http://getbootstrap)

eXtensible Markup Language (http://www.w3.org/TR/xml/)

JavaScript Object Notation (JSON) (http://json.org)

JSON for Linked Data (JSON-LD) (http://www.w3.org/TR/json-ld/)

Technology

JCAMP file is imported into PHP as an array, then

Clean

Uncomment ($$)

Separate

Labeled Data Records (LDRs)

Parameters (##.)

User Defined Labels (##$)

Validate

Standardize

Decompress

Convert to output format or store in database

Ingestion Process
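The first steps of the process above (clean, uncomment, separate) can be sketched as follows. This is an illustrative Python version, not the PHP implementation used by the OSDB. The function names `normalize_label` and `parse_jcamp` and the sample file content are made up for the example. Label normalization follows the JCAMP-DX convention of ignoring case, spaces, dashes, slashes, and underscores. Validation and decompression are not shown.

```python
import re


def normalize_label(label):
    """Upper-case a label and drop spaces, dashes, slashes, and underscores."""
    return re.sub(r"[\s\-/_]", "", label).upper()


def parse_jcamp(text):
    """Split JCAMP-DX text into core LDRs, parameters (##.), and user labels (##$)."""
    ldrs, params, user = {}, {}, {}
    current = None  # (dict, key) of the record that continuation lines belong to
    for raw in text.splitlines():
        # Uncomment: drop everything after the "$$" comment marker
        line = raw.split("$$", 1)[0].rstrip()
        if not line.strip():
            continue  # Clean: skip blank lines
        match = re.match(r"##([^=]*)=(.*)", line)
        if match:
            label, value = match.group(1).strip(), match.group(2).strip()
            # Separate: route the record by its label prefix
            if label.startswith("$"):
                target, key = user, label[1:]
            elif label.startswith("."):
                target, key = params, label[1:]
            else:
                target, key = ldrs, label
            key = normalize_label(key)
            target[key] = value
            current = (target, key)
        elif current:
            # A line without "##" continues the previous record (e.g. data rows)
            target, key = current
            target[key] = (target[key] + "\n" + line.strip()).strip()
    return ldrs, params, user


# Made-up sample file, for illustration only
SAMPLE = """##TITLE=Example spectrum $$ made-up sample
##JCAMP-DX=4.24
##DATA TYPE=NMR SPECTRUM
##.OBSERVE FREQUENCY=400.13
##$RELAX=1.5
##XYDATA=(X++(Y..Y))
1.0 10 20
3.0 30 40
##END=
"""

ldrs, params, user = parse_jcamp(SAMPLE)
```

Keeping the three groups in separate dictionaries mirrors the "Separate" step; the data table stays as raw text under `XYDATA` so that a later decompression step can work on it.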

In order to organize the data and metadata it is distributed across a number of tables in the database

This is a generic science data model that is being used for multiple projects

Not limited to spectra or even just Chemistry data

Data Model
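To illustrate what "distributed across a number of tables" can look like, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (`dataset`, `metadata`, `datapoint`) and the sample rows are hypothetical and are not the actual OSDB schema. The point is only that storing metadata as label/value rows lets the same tables hold any kind of measurement and still be searched.

```python
import sqlite3

# Hypothetical schema: one row per dataset, generic label/value metadata,
# and the numeric points in their own table.
SCHEMA = """
CREATE TABLE dataset (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    technique TEXT
);
CREATE TABLE metadata (
    dataset_id INTEGER REFERENCES dataset(id),
    label TEXT NOT NULL,
    value TEXT
);
CREATE TABLE datapoint (
    dataset_id INTEGER REFERENCES dataset(id),
    position INTEGER,
    x REAL,
    y REAL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up spectrum."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    cur = db.execute(
        "INSERT INTO dataset (title, technique) VALUES (?, ?)",
        ("Example spectrum", "NMR"),
    )
    ds = cur.lastrowid
    db.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [(ds, "OBSERVEFREQUENCY", "400.13"), (ds, "SOLVENT", "CDCl3")],
    )
    db.executemany(
        "INSERT INTO datapoint VALUES (?, ?, ?, ?)",
        [(ds, 0, 1.0, 10.0), (ds, 1, 2.0, 20.0)],
    )
    return db


def find_by_metadata(db, label, value):
    """Return titles of datasets that carry the given metadata label/value pair."""
    rows = db.execute(
        "SELECT d.title FROM dataset d "
        "JOIN metadata m ON m.dataset_id = d.id "
        "WHERE m.label = ? AND m.value = ?",
        (label, value),
    )
    return [row[0] for row in rows]
```

For example, `find_by_metadata(build_demo_db(), "SOLVENT", "CDCl3")` looks up datasets by a metadata field without the schema knowing anything about NMR specifically.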

File upload

Export formats

Search API

Live Demo

Semantic Annotation

Enthusiastic Feedback with constructive comments…

Spectral list is boring; needs molecules linked to spectra

Less metadata on the spectral page with option to see more

Revise homepage to make it more inviting

Reactions to Alpha Version

Again Enthusiastic…

“Love the layout! Very clean…”

“Nice Work!” (Twitter comment)

… with constructive comments

Needs a zoom spectra feature

Clicking on spectrum provides data that is not useful

Maybe you could use JSpecView rather than Flot?

Reactions to Beta Version

Handle more complicated JCAMP files

Handle file formats other than JCAMP

Export in AnIML format

Expand the API

Improve Flot viewer functionality (e.g. zoom)

Add JSpecView spectral viewer

Endpoint summary page

Document the website (GitHub)

Document how to contribute to the website (GitHub)

Solicit feature requests and encourage contributions

Things To Do

Take Home

The OSD is open for the community to develop and implement ideas about open spectral data re:

Data Model

API features

Export Formats

Services

Community Involvement!

Use as a data source for other applications

Submission of feature requests

Participation as code contributor

schalk@unf.edu

Phone: 904-620-5311

Skype: stuartchalk

Twitter: @StuChalk

LinkedIn/SlideShare: https://www.linkedin.com/in/stuchalk

ORCID: http://orcid.org/0000-0002-0703-7776

ResearcherID: http://www.researcherid.com/rid/D-8577-2013

Questions?