EMBRACE – BioMart Developments & Future Syed Haider Rice Group - EBI July 2008

Haider Embrace Bosc2008


Page 1: Haider Embrace Bosc2008

EMBRACE – BioMart Developments & Future

Syed Haider, Rice Group - EBI, July 2008

Page 2: Haider Embrace Bosc2008

EMBRACE (www.embracegrid.info)

European Model for Bioinformatics Research and Community Education

Objective:

to integrate the major databases and software tools in bioinformatics

Page 3: Haider Embrace Bosc2008

A Collaboration:

- European Bioinformatics Institute (EBI)

- Ontario Institute for Cancer Research (OICR)

BioMart (www.biomart.org)

Page 4: Haider Embrace Bosc2008

BioMart

A generic data management system with a particular focus on supporting biological research, featuring:

- Built-in query optimisation for fast data retrieval
- Data Federation
- Easy to use interfaces and APIs
- Web Services and DAS

Page 5: Haider Embrace Bosc2008

In a nutshell

[Diagram: sequence data; source data (MySQL, Oracle, Postgres); DB; Mart]

Page 6: Haider Embrace Bosc2008

Deploying BioMart

– STEP 1 - Transformation
– STEP 2 - Configuration

Page 7: Haider Embrace Bosc2008

1. Transformation

[Diagram: sequence data; source data (MySQL, Oracle, Postgres); DB; Mart]

Page 8: Haider Embrace Bosc2008

1. Transformation: MartBuilder
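The slides do not describe the schema that MartBuilder produces, so the following is only a generic illustration of what a "transformation" from a normalised source database into a query-friendly table can look like. The table names (gene, transcript, mart_gene_main) and the toy rows are hypothetical, and SQLite stands in for the MySQL, Oracle or Postgres sources named earlier.

```python
import sqlite3

# Hypothetical, tiny "source database": two normalised tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT, chromosome TEXT);
    CREATE TABLE transcript (transcript_id INTEGER PRIMARY KEY,
                             gene_id INTEGER REFERENCES gene(gene_id),
                             biotype TEXT);
    INSERT INTO gene VALUES (1, 'geneA', '1'), (2, 'geneB', '2');
    INSERT INTO transcript VALUES (10, 1, 'protein_coding'),
                                  (11, 1, 'pseudogene'),
                                  (12, 2, 'protein_coding');

    -- "Transformation" step (illustrative only): materialise the join once,
    -- so that later queries read a single wide table without joining.
    CREATE TABLE mart_gene_main AS
        SELECT g.gene_id, g.name, g.chromosome, t.transcript_id, t.biotype
        FROM gene AS g JOIN transcript AS t ON t.gene_id = g.gene_id;
""")

# A filter (chromosome = '1') and three attributes, answered from one table.
rows = conn.execute(
    "SELECT name, transcript_id, biotype FROM mart_gene_main "
    "WHERE chromosome = '1' ORDER BY transcript_id"
).fetchall()
print(rows)
```

The point of the sketch is only the trade-off: the join cost is paid once at build time rather than on every query.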

Page 9: Haider Embrace Bosc2008

2. Configuration

[Diagram: three Marts]

Page 10: Haider Embrace Bosc2008

2. Configuration: MartEditor

Page 11: Haider Embrace Bosc2008

User Interfaces

Page 12: Haider Embrace Bosc2008

Concepts for End Users

1. Dataset

2. Filter

3. Attribute

Page 13: Haider Embrace Bosc2008

Examples

Dataset: all rat genes
Filter: located on chromosome 1, expressed in lungs
Attribute: name, chromosome, description

Dataset: all mouse genes
Filter: ENSMUSG00000042351
Attribute: exon sequences in FASTA format

Dataset: all rat genes
Filter: up-regulated in brain and associated with a QTL for a neurological disorder
Attribute: upstream sequences
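The three end-user concepts can be pictured as one small record per question. The sketch below is an assumption made for illustration: the class MartQuery and its describe method are hypothetical and not part of BioMart, and the filter and attribute names are the plain-language wording of the first rat-gene example rather than real BioMart identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class MartQuery:
    """One question: what to return, from which dataset, restricted how."""
    dataset: str                                    # which collection of records
    filters: dict = field(default_factory=dict)     # restrictions on the records
    attributes: list = field(default_factory=list)  # columns to return

    def describe(self):
        """Render the query as a plain-language sentence."""
        wanted = ", ".join(self.attributes)
        where = " and ".join(f"{k} = {v}" for k, v in self.filters.items())
        return f"{wanted} of {self.dataset}" + (f" where {where}" if where else "")


# The first rat-gene example, in plain-language (not real) names.
first_example = MartQuery(
    dataset="all rat genes",
    filters={"chromosome": "1", "expressed in": "lungs"},
    attributes=["name", "chromosome", "description"],
)
print(first_example.describe())
```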

Page 14: Haider Embrace Bosc2008

Web Service Access

<Query>
  <Dataset name="hsapiens_gene_ensembl">
    <Filter name="chromosome_name" value="1"/>
    <Attribute name="ensembl_gene_id"/>
    <Attribute name="ensembl_transcript_id"/>
    <Attribute name="biotype"/>
  </Dataset>
</Query>

Post the XML as the value of the query parameter:

wget --post-data 'query=<Query> ... </Query>' 'http://www.biomart.org/biomart/martservice'
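For readers who prefer a script to wget, here is a minimal Python sketch of the same request, using only the standard library. It builds the Query XML with the dataset, filter and attribute names from the slide and prepares a POST with a single form field named query. The helper names build_query and build_request are hypothetical. The request is deliberately not sent here, and whether the server needs anything beyond what the slide shows (for example an XML declaration) is not covered.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

MARTSERVICE_URL = "http://www.biomart.org/biomart/martservice"  # URL from the slide


def build_query(dataset, filters, attributes):
    """Return a <Query> document like the one on the slide, as a string."""
    query = ET.Element("Query")
    ds = ET.SubElement(query, "Dataset", name=dataset)
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", name=name, value=value)
    for name in attributes:
        ET.SubElement(ds, "Attribute", name=name)
    return ET.tostring(query, encoding="unicode")


def build_request(xml_query, url=MARTSERVICE_URL):
    """Prepare (but do not send) a POST carrying the XML in a 'query' field."""
    body = urllib.parse.urlencode({"query": xml_query}).encode("ascii")
    return urllib.request.Request(url, data=body)


xml_query = build_query(
    "hsapiens_gene_ensembl",
    {"chromosome_name": "1"},
    ["ensembl_gene_id", "ensembl_transcript_id", "biotype"],
)
request = build_request(xml_query)
# To actually send it: urllib.request.urlopen(request).read()
```

Unlike the wget command, this sketch URL-encodes the form body, which is the usual way to post form data.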

Page 15: Haider Embrace Bosc2008

Web Service Access

The same <Query> XML and wget command as on the previous slide, shown with both endpoints:

http://www.biomart.org/biomart/martservice
http://www.biomart.org/biomart/martview

Page 16: Haider Embrace Bosc2008

Web Service Access: XML Free URL

http://biomart.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.feature_page.ensembl_gene_id&FILTERS=hsapiens_gene_ensembl.default.filters.chromosome_name."1"
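The URL above follows a visible pattern: dataset, then "default", then a page or group name, then the attribute or filter name, with the filter value in double quotes. A small sketch of assembling it is below. The function martview_url is hypothetical, it handles only one attribute and one filter because the slide shows only that case, and the fixed path parts ("default", "feature_page", "filters") are copied from the slide rather than from any documentation.

```python
MARTVIEW_URL = "http://biomart.org/biomart/martview"  # base URL from the slide


def martview_url(dataset, attribute, filter_name, filter_value,
                 schema="default", base=MARTVIEW_URL):
    """Assemble an XML-free martview URL in the shape shown on the slide."""
    attributes = f"{dataset}.default.feature_page.{attribute}"
    filters = f'{dataset}.default.filters.{filter_name}."{filter_value}"'
    return (f"{base}?VIRTUALSCHEMANAME={schema}"
            f"&ATTRIBUTES={attributes}&FILTERS={filters}")


url = martview_url("hsapiens_gene_ensembl", "ensembl_gene_id",
                   "chromosome_name", "1")
print(url)
```

The double quotes around the filter value are left unencoded to match the slide; a browser or HTTP client may percent-encode them.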

Page 17: Haider Embrace Bosc2008

BioMart DAS Access

http://www.YourBioMart.org/biomart/das/DATASET/features?segment=FILTERS

http://www.biomart.org/biomart/das/default__hsapiens_gene_ensembl__ensembl_das_chr/features?segment=1:1,100000

http://www.biomart.org/biomart/das/default__hsapiens_gene_ensembl__ensembl_das_gene/features?segment=ENSG00000197194
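The template and its two instances differ only in the DATASET part and the segment value, which a short helper makes explicit. The function das_features_url is hypothetical and does nothing more than fill in the template shown above; it does not contact the server.

```python
def das_features_url(host, dataset, segment):
    """Fill in the DAS 'features' URL template from the slide."""
    return f"http://{host}/biomart/das/{dataset}/features?segment={segment}"


# Chromosome region: chromosome 1, positions 1 to 100000.
by_region = das_features_url(
    "www.biomart.org",
    "default__hsapiens_gene_ensembl__ensembl_das_chr",
    "1:1,100000",
)

# Single gene, addressed by its Ensembl gene ID.
by_gene = das_features_url(
    "www.biomart.org",
    "default__hsapiens_gene_ensembl__ensembl_das_gene",
    "ENSG00000197194",
)
print(by_region)
print(by_gene)
```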

Page 18: Haider Embrace Bosc2008

Web-based Access: how far has it gone?

Page 19: Haider Embrace Bosc2008

Taverna

Page 20: Haider Embrace Bosc2008

biomaRt - Bioconductor package

Page 21: Haider Embrace Bosc2008

Cytoscape

Page 22: Haider Embrace Bosc2008

Galaxy

Page 23: Haider Embrace Bosc2008

Template Queries

Page 24: Haider Embrace Bosc2008
Page 25: Haider Embrace Bosc2008
Page 26: Haider Embrace Bosc2008

Learn as you go....

Show URL Request

Show XML Query

Show Perl Script

Page 27: Haider Embrace Bosc2008

Future

- Scalability: maintaining large databases and configurations
- Security: username/password-based access for clinical and experimental data, etc.
- Multiple and custom GUIs

Page 28: Haider Embrace Bosc2008

Future

- Beyond rows and columns
- Framework for Visualisations and Analysis Tools

Page 29: Haider Embrace Bosc2008

Visualisation: Gene List Analysis & Clinical Significance

Gene List

Query

Visualisation

Gene list analysis
- Map genes onto genome
- Map genes onto GO
- Map genes onto Pathways

Clinical significance
- Response to therapy

Page 30: Haider Embrace Bosc2008

Map Genes onto Genome

Page 31: Haider Embrace Bosc2008

Visualisation: Gene List Analysis & Clinical Significance


Page 32: Haider Embrace Bosc2008

Map Genes onto GO

GO

Biological process (32)

Cellular component (18)

Molecular Function (24)

Stem cell maintenance (7)

Positive regulation of developmental process (8)

Leukocyte mediated cytotoxicity (5)

regulation of cell killing (12)

Developmental process (15)

Cell killing (17)
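The numbers in brackets read as the number of genes from the list that fall under each GO term. How the presenters computed them is not stated, so the following is only a sketch of one straightforward way to get such counts. The function count_genes_per_term, the gene names and the annotations are all made-up placeholders, not data from the talk.

```python
from collections import Counter


def count_genes_per_term(gene_list, annotations):
    """Count how many distinct genes in gene_list are annotated to each term."""
    counts = Counter()
    for gene in set(gene_list):                 # ignore duplicate gene entries
        for term in annotations.get(gene, ()):  # unannotated genes add nothing
            counts[term] += 1
    return counts


# Placeholder annotations: gene -> set of GO term names (toy data).
toy_annotations = {
    "geneA": {"cell killing", "developmental process"},
    "geneB": {"cell killing"},
    "geneC": {"stem cell maintenance"},
}
toy_gene_list = ["geneA", "geneB", "geneB", "geneD"]

counts = count_genes_per_term(toy_gene_list, toy_annotations)
for term, n in sorted(counts.items()):
    print(f"{term} ({n})")
```

A real implementation would also need to propagate counts up the GO hierarchy so that parent terms include genes annotated to their children; this sketch does not attempt that.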

Page 33: Haider Embrace Bosc2008

Visualisation: Gene List Analysis & Clinical Significance


Page 34: Haider Embrace Bosc2008

Map Genes onto Pathways

Reactome

Apoptosis (43)

Intrinsic pathway for apoptosis (26)

Signaling by Wnt (10)

Signaling by TGFβ (23)

Activation of BH3-only proteins (5)

Permeabilization of mitochondria (3)

Release of apoptotic factors from mitochondria (18)

Page 35: Haider Embrace Bosc2008

Future - Summary Pages

Annotation for each gene

1. Entrez/Ensembl gene info

2. Gene ontology/pathways

3. Bibliography

4. Transcript & protein info, etc.

Genomic variations for each gene

1. for each cancer studied

Information for each patient

1. Demographics

2. History of cancer

3. Progress & outcome

4. Types of samples available

5. Histopathology of tumor

Submission support

Page 36: Haider Embrace Bosc2008

Galaxy

BioMart Team
Arek Kasprzyk (OICR-Toronto)
Syed Haider (Rice Group-EBI)

Acknowledgements
Benoit Ballester (Ensembl)
Richard Holland (Ensembl)
Andreas Kahari (Ensembl)
Craig Melsopp (Ensembl)
Damian Smedley (Ensembl)
Arne Stabenau (Ensembl)
Asif Kibria (EBI)
Gulam Patel (EBI)
Stephen Robinson (EBI)
Katerina Tzouvara (EBI)
Will Spooner (CSHL)
Gudmundur Thorisson (CSHL)
Darin London (Duke University)
Don Gilbert (Indiana University)
Steffen Durinck (NCI NIH)
Eric Just (Northwestern University)
Paul Donlon (Unilever)
Christina Yung (OICR)
Igor Antoshechkin (Caltech)

Credits

References

Page 37: Haider Embrace Bosc2008

Thanks.

Page 38: Haider Embrace Bosc2008

BioMart Central Portal – queries served

[Chart: queries served per month, Sept'07 to Mar'08; vertical axis from 0 to 1,600,000 in steps of 200,000]