Lowering barriers to publishing biological data on the web

DESCRIPTION

A short 10-minute talk encouraging bioinformatics programmers to organize and reuse code aimed at making data easily available on the web. Current open source technologies are combined into a higher-level framework. An example implementation using Google App Engine and existing bioinformatics libraries is presented.

Lowering barriers to publishing biological data on the web

Brad Chapman

Department of Molecular Biology

Massachusetts General Hospital

Boston, MA USA

chapmanb@50mail.com

http://friendfeed.com/chapmanb

27 June 2009

Motivation

- Web accessible
- Interoperable in standard formats
- Displays for browsing
- Analyses
- Scale

Current state: Reusable libraries

- Parse file formats
- Run programs
- Build analysis pipelines
- Communities

Python examples

- Biopython
- bx-python
- pygr
- PyCogent

Current state: Database schemas

- Represent biological data
- Expand analyses beyond flat files
- Interoperate with standards

BioSQL Chado

Current state: Web applications

Faster and Bigger

Proposal

- Provide
  - Reusable presentation components
  - Quickly deployable frameworks
- Integrate
  - Bioinformatics libraries
  - Database schemas
  - Web development frameworks
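
One way to picture the integration proposed above, as a minimal sketch and not the biosqlweb implementation: a database table holds the records, and a small reusable function renders them as JSON for a front end. The `bioentry_demo` table and its columns, the `load_demo` and `entries_as_json` helpers, and the sample record are all hypothetical; the table is a simplified stand-in, not the actual BioSQL schema.

```python
import json
import sqlite3


def load_demo(conn):
    """Create a simplified, hypothetical table and add one made-up record."""
    conn.execute(
        "CREATE TABLE bioentry_demo ("
        "accession TEXT PRIMARY KEY, description TEXT, seq TEXT)"
    )
    conn.execute(
        "INSERT INTO bioentry_demo VALUES (?, ?, ?)",
        ("DEMO0001", "made-up example record", "ATGGCCATTGTAATGGGCCGC"),
    )
    conn.commit()


def entries_as_json(conn):
    """Render every stored entry as JSON that a web front end could consume."""
    rows = conn.execute(
        "SELECT accession, description, seq FROM bioentry_demo ORDER BY accession"
    )
    entries = [
        {"accession": acc, "description": desc, "length": len(seq)}
        for acc, desc, seq in rows
    ]
    return json.dumps({"entries": entries})


# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
load_demo(conn)
payload = entries_as_json(conn)
```

Keeping the rendering step in its own function is the point of the sketch: the same component could sit behind any web framework or database backend.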

Proposal

http://biosqlweb.appspot.com/

Challenges: Design

- Reusable
  - Components: avoid large framework
  - Multi-language: JavaScript front end
- Accessible
  - Automated data retrieval (REST)
  - Standard formats (GFF, RDF)
- Available
  - Creative Commons: http://creativecommons.org/about/licenses
  - Open Data Commons: http://www.opendatacommons.org/licenses/
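
The "Accessible" goals above, automated retrieval over REST in a standard format, might look roughly like the following sketch. It uses only the Python standard library; the `/features/<record id>` URL layout, the `FEATURES` data, the `demo` source name, and the `to_gff3`, `app`, and `fetch` names are assumptions made for illustration, not part of the talk.

```python
from wsgiref.util import setup_testing_defaults

# Made-up features keyed by a hypothetical record identifier.
FEATURES = {
    "DEMO0001": [
        {"type": "gene", "start": 1, "end": 21, "strand": "+", "id": "gene1"},
    ],
}


def to_gff3(seqid, features):
    """Format features as GFF3: a version header plus nine tab-separated columns."""
    lines = ["##gff-version 3"]
    for feat in features:
        columns = [
            seqid, "demo", feat["type"], str(feat["start"]), str(feat["end"]),
            ".", feat["strand"], ".", "ID=" + feat["id"],
        ]
        lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"


def app(environ, start_response):
    """WSGI app answering /features/<record id> with GFF3 text."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if len(parts) == 2 and parts[0] == "features" and parts[1] in FEATURES:
        body = to_gff3(parts[1], FEATURES[parts[1]]).encode("utf-8")
        start_response("200 OK", headers)
    else:
        body = b"not found\n"
        start_response("404 Not Found", headers)
    return [body]


def fetch(path):
    """Call the app in-process, standing in for a script issuing a GET request."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    status = []
    body = b"".join(app(environ, lambda s, hdrs: status.append(s)))
    return status[0], body.decode("utf-8")
```

Calling `fetch("/features/DEMO0001")` returns the status line and the GFF3 text without opening a network socket. To try it over HTTP, the same `app` can be passed to `wsgiref.simple_server.make_server`.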

Challenges: Community questions

How do we...

- provide plug-in components?
- leverage existing code?
- make reuse easier?
- communicate about these issues?
