An Exemplar for Data Integration in the Biomedical Domain Driven by the ISA Framework

7/28/2019 An Exemplar for Data Integration in the Biomedical Domain Driven by the ISA Framework

http://slidepdf.com/reader/full/an-exemplar-for-data-integration-in-the-biomedical-domain-driven-by-the-isa 1/32

An exemplar for data integration in the biomedical domain driven by the ISA framework

Shannan Ho Sui

AMIA Summits on Translational Bioinformatics

March 19, 2013

http://stemcellcommons.org

This is a story about collaboration...

ISA

Disparate Stem Cell Resources

• Inconsistent data formats, experimental descriptions and results

The Stem Cell Commons

• A shared data and analytical resource

• Bioinformatics support for research at the HSCI

• A community

Data repository

Analysis system

Support/consults

Susanna-Assunta Sansone, isacommons.org

user community

General-purpose, configurable format, designed to support the use of several standards, checklists, terminologies and conversions to (a growing number of) other metadata formats used by public repositories, e.g.

• MAGE-Tab

• SRA-xml

• SOFT

• Pride-xml

Rationale for developing ISA

• Capture all salient features of the experimental workflow

• Make annotation explicit and discoverable

• Support data provenance tracking

• Use community standards

ISA

Manual merging process

53 studies, 1098 assays

87 studies, 1179 assays

Curator

148 studies, 2356 assays

ISA

Conversion driven by ISA-Tab

53 studies, 1098 assays

87 studies, 1179 assays

ISA-Tab

148 studies, 2356 assays
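
The idea of converting each resource into one common format before combining them can be sketched as below. This is only an illustration under stated assumptions, not the Stem Cell Commons pipeline: the two example records, their native field names (`sample_id`, `species`, `name`, `organism_name`, `lab`) and the mapping tables are hypothetical. Only the target headers (`Sample Name`, `Characteristics[organism]`) follow the column naming used in ISA-Tab study tables.

```python
# Hypothetical mappings from each source's native field names
# to ISA-Tab-style column headers.
SOURCE_A_MAP = {"sample_id": "Sample Name", "species": "Characteristics[organism]"}
SOURCE_B_MAP = {"name": "Sample Name", "organism_name": "Characteristics[organism]"}


def to_common(record, mapping):
    """Rename the fields of one record; fields without a mapping are dropped."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def merge_sources(*sources):
    """Convert each (records, mapping) pair and concatenate the results."""
    merged = []
    for records, mapping in sources:
        merged.extend(to_common(record, mapping) for record in records)
    return merged


# Made-up example records, one per hypothetical source.
source_a = [{"sample_id": "A1", "species": "Homo sapiens", "lab": "x"}]
source_b = [{"name": "B7", "organism_name": "Mus musculus"}]

merged = merge_sources((source_a, SOURCE_A_MAP), (source_b, SOURCE_B_MAP))
for row in merged:
    print(row)
```

Once every source is expressed with the same headers, combining them is a plain concatenation, which is what makes a shared format attractive compared with merging each pair of resources by hand.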

Data uploads and annotation

Current Data Statistics

Filtering data using metadata as search facets
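
Faceted filtering over tab-delimited metadata can be sketched as follows. This is a minimal example, assuming a small in-memory table whose headers follow the ISA-Tab study-table style. The sample names and values are made up, and the helper names (`load_rows`, `facet_counts`, `filter_rows`) are hypothetical rather than part of the repository's code.

```python
import csv
import io
from collections import Counter

# Made-up study table in tab-delimited form with ISA-Tab-style headers.
STUDY_TABLE = (
    "Sample Name\tCharacteristics[organism]\tCharacteristics[cell type]\n"
    "s1\tHomo sapiens\thematopoietic stem cell\n"
    "s2\tMus musculus\thematopoietic stem cell\n"
    "s3\tHomo sapiens\tfibroblast\n"
)


def load_rows(text):
    """Parse tab-delimited text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def facet_counts(rows, column):
    """Count how many rows carry each distinct value of one metadata column."""
    return Counter(row[column] for row in rows)


def filter_rows(rows, selected):
    """Keep only the rows that match every selected facet value."""
    return [
        row for row in rows
        if all(row[column] == value for column, value in selected.items())
    ]


rows = load_rows(STUDY_TABLE)
counts = facet_counts(rows, "Characteristics[organism]")
human = filter_rows(rows, {"Characteristics[organism]": "Homo sapiens"})

print(counts)
print([row["Sample Name"] for row in human])
```

The counts per value are what a facet panel would display next to each option, and the selected values narrow the list of samples.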

Experiment description

Experimental protocols and data downloads

ISA-Tab metadata downloads and export

Linking data to the Galaxy workflow engine

Refinery: An analysis and visualization framework

In development

Viewing and selecting samples in list view

Viewing and selecting samples in matrix view

Initiating workflows

Monitoring progress

Integration with the IGV genome browser

Challenges

• Changing research culture(s) to recognize the value of data sharing

• Manually curating the data for consistency and completeness

• Managing large volumes of data

• Standardizing workflows

• Ensuring interoperability when integrating multiple systems and tools

• Technical complexity of software development effort

Refinery

Psalm Haseley, Nils Gehlenborg, Richard Park, Ilya Sytchev, Peter Park, Shannan Ho Sui

ISA Commons

Philippe Rocca-Serra

Eamonn Maguire

Susanna Sansone

Oxford e-Research Centre

A growing community that uses the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of datasets.

WikiPathways

Meet the Team

Center for Stem Cell Bioinformatics

Winston Hide, Program Leader

Shannan Ho Sui, Analytics

Oliver Hofmann, Core services

Ilya Sytchev, Bioinformatics Developer

John Hutchinson, HSCI Analyst

Sudeshna Das, Repository

Stéphane Corlosquet, Bioinformatics Engineer

Emily Merrill, Bioinformatics Analyst

• Nils Gehlenborg

• Richard Park

• Psalm Haseley

• Peter Park

Collaborators

• Eamonn Maguire

• Philippe Rocca-Serra

• Susanna Sansone
