MOLGENIS and the eXtensible Genotype And Phenotype database project (xgap)
Morris A. Swertz et al, DAM & BOSC SIGs, Stockholm, June 27 2009
EBI Biobanking platform


Page 1: Swertz Molgenis Bosc2009

MOLGENIS and the eXtensible Genotype And Phenotype database project (xgap)

Morris A. Swertz et al

DAM & BOSC SIGs, Stockholm, June 27 2009

EBI Biobanking platform

Page 2: Swertz Molgenis Bosc2009

Outline

› MOLGENIS database generator
  - Free toolbox of automated best practices: auto-generate useful data apps (SQL, Java, R, SOAP) from simple models
  - An open, pluggable platform that harmonizes data syntax, programming interfaces and user interaction

› Demo

› eXtensible Genotype And Phenotype (xgap)
  - To store various high-throughput genotypes and phenotypes in a harmonized way
  - As a platform for collaboration and analysis tools

› Current work

Page 3: Swertz Molgenis Bosc2009

MOLGENIS, why and how

[Figure: workflow diagram of the biologist's experiment. Process steps: inbreed, genotype, hybridize, preprocess, map, correlate. Materials and results: genome, strains, individuals, markers, genotypes, microarrays, probes, expressions, normalized expressions, QTL profiles, network. Scale annotations run from 10 items up to millions. Caption: biological challenges.]

Page 4: Swertz Molgenis Bosc2009

MOLGENIS, why and how

[Figure: workflow diagram with process steps (inbreed, genotype, hybridize, preprocess, map, correlate) and materials and results (genome, strains, individuals, markers, genotypes, microarrays, probes, expressions, normalized expressions, QTL profiles, network). The biologist's biological challenges are set against the bioinformatician and software engineers, who must supply suitable infrastructure.]

Page 5: Swertz Molgenis Bosc2009

MOLGENIS, why and how

[Figure: workflow diagram for an LC/MS experiment. Process steps: inbreed, genotype, LC/MS, preprocess, map, correlate. Materials and results: genome, strains, individuals, SNP arrays, genotypes, mass peaks (illustrated by a mass spectrum, m/z 100 to 1000), aligned peaks, QTL profiles, network. Scale annotations run from 10 to 10,000,000 items.]

Reinventing wheels, wasting time, hard to integrate

Biologist: biological challenges. Bioinformatician and software engineers: suitable infrastructure.

Page 6: Swertz Molgenis Bosc2009

Alternative strategy

http://www.molgenis.org
Swertz & Jansen (2007) Nature Reviews Genetics 8, 235-243

Page 7: Swertz Molgenis Bosc2009

MOLGENIS, why and how

Platform and generators + Little language / Blueprint model

Blueprint model (snippet truncated on the slide):

    <!-- entity organization -->
    <entity name="Experiment" label="Experiment">
      <field name="ExperimentID" key="1" readonly="true" label="ExperimentID(autonum)"/>
      <field name="Medium" type="xref" xref_field="Medium.name"/>
      <field name="Protocol" label="Experiment Protocol"/>
      <field name="Temperature" type="int"

[Figure: the bioinformatician and software engineer combine platform, generators and blueprint model to support the biologist's workflow: inbreed, genotype, hybridize, preprocess, map, correlate, over genome, strains, individuals, markers, genotypes, microarrays, probes, expressions, normalized expressions, QTL profiles and network.]

http://www.molgenis.org
Swertz & Jansen (2007) Nature Reviews Genetics 8, 235-243
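The idea of generating software from a small blueprint model can be illustrated with a short sketch. The Python below is not the MOLGENIS generator; it only shows the principle of deriving a SQL table definition from an entity description. The `SQL_TYPES` mapping, the `generate_create_table` helper, the closing tags and the completed `Temperature` field are all assumptions added so that the example parses.

```python
import xml.etree.ElementTree as ET

# Blueprint in the slide's "little language". The closing tags and the end of
# the Temperature field are assumptions; the slide truncates the snippet.
MODEL = """
<entity name="Experiment" label="Experiment">
  <field name="ExperimentID" key="1" readonly="true" label="ExperimentID(autonum)"/>
  <field name="Medium" type="xref" xref_field="Medium.name"/>
  <field name="Protocol" label="Experiment Protocol"/>
  <field name="Temperature" type="int"/>
</entity>
"""

# Hypothetical mapping from model types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "xref": "INTEGER", "string": "VARCHAR(255)"}


def generate_create_table(entity_xml: str) -> str:
    """Turn one <entity> element into a CREATE TABLE statement."""
    entity = ET.fromstring(entity_xml)
    columns = []
    for field in entity.findall("field"):
        # Fields without an explicit type are treated as strings (assumption).
        sql_type = SQL_TYPES[field.get("type", "string")]
        column = f"  {field.get('name')} {sql_type}"
        if field.get("key") == "1":
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {entity.get('name')} (\n" + ",\n".join(columns) + "\n);"


ddl = generate_create_table(MODEL)
print(ddl)
```

The same parsed model could feed other targets (forms, API stubs) in the same way, which is the point of keeping the blueprint separate from the generated code.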

Page 8: Swertz Molgenis Bosc2009

Upgrade to new research

Platform and generators + Little language / Blueprint model (the truncated `Experiment` entity XML snippet from slide 7)

[Figure: New Biology. The bioinformatician and software engineer reuse platform and generators with a new blueprint model to support the biologist's LC/MS workflow: inbreed, genotype, LC/MS, preprocess, map, correlate, over genome, strains, individuals, SNP arrays, genotypes, mass peaks, aligned peaks, QTL profiles and network.]

http://www.molgenis.org
Swertz & Jansen (2007) Nature Reviews Genetics 8, 235-243

Page 9: Swertz Molgenis Bosc2009

Upgrade to new software tools

Platform and software generators + Little language / Blueprint model (the truncated `Experiment` entity XML snippet from slide 7)

[Figure: the bioinformatician and software engineer upgrade platform and software generators while the blueprint model stays the same; the biologist's LC/MS workflow is unchanged: inbreed, genotype, LC/MS, preprocess, map, correlate, over genome, strains, individuals, SNP arrays, genotypes, mass peaks, aligned peaks, QTL profiles and network.]

http://www.molgenis.org
Swertz & Jansen (2007) Nature Reviews Genetics 8, 235-243

Page 10: Swertz Molgenis Bosc2009

Demo

Page 11: Swertz Molgenis Bosc2009

Step 1: model*

[UML class diagram, attributes as recoverable from the slide]

Experiment: ID : autoid, Name : varchar
Assay: ID : autoid, Name : varchar (belongs to 1 Experiment)
Trait: ID : autoid, Name : varchar (belongs to 1 Experiment)
Subject: ID : autoid, Name : varchar (belongs to 1 Experiment)
Data: ID : autoid, Value : object (refers to 1 Assay, 1 Row, 1 Column)

Examples: individuals, expressions, probes

*Can also extract automatically from an existing database

Page 12: Swertz Molgenis Bosc2009

[Figure: the Step 1 UML class diagram again: Experiment, Assay, Trait and Subject (each with ID : autoid, Name : varchar) and Data (ID : autoid, Value : object), where Data refers to 1 Assay, 1 Row and 1 Column.]

Page 13: Swertz Molgenis Bosc2009

Step 2: generate

Model file (XML), then Generate, then Download and customize...

Generated: APIs in Java, R, Web services and HTTP; MyScriptPlugins

Generators: FormGen, MenuGen, TreeGen, PluginGen, MatrixGen, JTypeGen, JDBCMapGen, JListGen, JReadCsvGen, HSQLGen, JDatabaseGen, MySQLGen, RMatrixGen, WSGen, RListGen

Generated layers: data infrastructure, user interaction infrastructure, communication infrastructure
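The generate step can be pictured as one model fed to a set of independent generators, each emitting an artifact for one layer. The sketch below is a toy stand-in under stated assumptions: the three generator functions, their output formats and the `generate` helper are hypothetical and are not the real FormGen, MySQLGen or RListGen implementations.

```python
from typing import Callable, Dict, List, Tuple

# Toy model: entity name -> field names (loosely following the Step 1 diagram).
MODEL: Dict[str, List[str]] = {
    "Experiment": ["ID", "Name"],
    "Assay": ["ID", "Name"],
}

Generator = Callable[[str, List[str]], str]


def sql_gen(entity: str, fields: List[str]) -> str:
    """Data layer: emit a (typeless) table definition."""
    return f"CREATE TABLE {entity} ({', '.join(fields)});"


def r_gen(entity: str, fields: List[str]) -> str:
    """Communication layer: emit the name of an R accessor function."""
    return f"find.{entity.lower()}()"


def form_gen(entity: str, fields: List[str]) -> str:
    """User interaction layer: emit a one-line form description."""
    return f"Form[{entity}]: " + " | ".join(fields)


# Hypothetical registry; adding a new target means adding one entry here.
GENERATORS: Dict[str, Generator] = {"sql": sql_gen, "r": r_gen, "form": form_gen}


def generate(model: Dict[str, List[str]],
             generators: Dict[str, Generator]) -> Dict[Tuple[str, str], str]:
    """Run every generator over every entity in the model."""
    return {
        (target, entity): gen(entity, fields)
        for target, gen in generators.items()
        for entity, fields in model.items()
    }


artifacts = generate(MODEL, GENERATORS)
for key, artifact in sorted(artifacts.items()):
    print(key, artifact)
```

Because each generator only sees the model, a new target can be added without touching the others.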

Page 14: Swertz Molgenis Bosc2009

Step 3: use result

› Let's see

Page 15: Swertz Molgenis Bosc2009

eXtensible Genotype And Phenotype database for QTL and GWAS experiments

Date: 24.06.2009

Page 16: Swertz Molgenis Bosc2009

Example projects

› Locus Specific database
› Clin. Trial metabase
› NextGen sequencing
› Proteo/Metabolomics
› Animal Observations

Page 17: Swertz Molgenis Bosc2009

XGAP - DAM Challenges

Challenges:
› Share data between QTL collaborators
› Variety of species/methods
› Reuse the ad-hoc analysis protocols

Aim:
› Simple common data model and format
› Common interaction layers (R, SOAP)
› Platform for reusable protocols/tools
› Reuse between individual projects

[Figure: workflow diagram. Process steps: inbreed, genotype, hybridize, preprocess, map, correlate. Materials and results: genome, strains, individuals, markers, genotypes, microarrays, probes, expressions, normalized expressions, QTL profiles, network. Legend: main work flow, data dependency, biomaterial/result (material), lab/analysis process (process), scale of information, associated data files.]

Page 18: Swertz Molgenis Bosc2009

First objective

[Figure: researchers exchange annotations and raw and processed data through a shared database, "my GaP".]

Page 19: Swertz Molgenis Bosc2009

1. Data model

Genotype data

Rows are Traits (MARKERS), columns are Subjects (STRAINS), cells are DATA ELEMENTS:

              BxD1  BxD2  BxD3  BxD4  BxD5  BxD6  BxD7
rs13475697    1     1     0     1     0     1     0
rs13475698    1     0     0     0     0     0     1
rs13475699    0     0     0     1     0     1     1
rs13475700    1     1     1     1     0     1     0
rs13475701    1     0     1     0     0     1     1
rs2228909     1     1     0     1     0     0     0
rs2228910     0     0     1     1     0     0     0
rs3022775     0     0     0     1     1     0     1
rs3024102     1     0     1     0     0     0     0
rs3024103     1     0     0     1     0     0     0
rs3024104     0     1     0     0     0     0     0
rs3024105     0     0     1     0     0     0     1
rs30462182    1     0     0     0     0     0     0
rs30522279    0     1     0     0     1     0     0

Looking at standards and existing data sets
Simple enough for everybody to create
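A matrix like this maps directly onto individual data elements, one per (trait, subject) cell. The sketch below shows that unpacking for the first three marker rows of the slide; the `to_data_elements` helper and the tuple layout are assumptions for illustration, not the XGAP import code.

```python
from typing import List, Tuple

# First three rows of the genotype matrix on the slide (markers x strains).
MATRIX_TEXT = """\
BxD1 BxD2 BxD3 BxD4 BxD5 BxD6 BxD7
rs13475697 1 1 0 1 0 1 0
rs13475698 1 0 0 0 0 0 1
rs13475699 0 0 0 1 0 1 1
"""

DataElement = Tuple[str, str, int]  # (trait, subject, value)


def to_data_elements(text: str) -> List[DataElement]:
    """Unpack a whitespace-separated matrix into one element per cell."""
    lines = text.strip().splitlines()
    subjects = lines[0].split()          # header row: column names (strains)
    elements: List[DataElement] = []
    for line in lines[1:]:
        trait, *values = line.split()    # first token: row name (marker)
        if len(values) != len(subjects):
            raise ValueError(f"row {trait} has {len(values)} values, "
                             f"expected {len(subjects)}")
        for subject, value in zip(subjects, values):
            elements.append((trait, subject, int(value)))
    return elements


elements = to_data_elements(MATRIX_TEXT)
print(len(elements), "data elements")
print(elements[0])
```

Storing cells this way keeps the format independent of which kind of trait or subject labels the rows and columns.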

Page 20: Swertz Molgenis Bosc2009

1. Data model

What about QTL data?

Rows are Traits (PROBES), columns are Traits (MARKERS):

               rs13475697  rs13475698  rs13475699  rs13475700  rs13475701  rs2228909  rs2228910
1415670_at     0,981848    0,293227    0,034092    0,360978    0,298958    0,466545   0,370369
1415671_at     0,464346    0,817348    0,990231    0,204923    0,353808    0,668164   0,449354
1415672_at     0,243834    0,900083    0,69971     0,217804    0,471408    0,701617   0,026609
1415673_at     0,712543    0,001536    0,209082    0,196611    0,191452    0,91619    0,535659
1415674_a_at   0,159777    0,101577    0,678902    0,233476    0,251812    0,349968   0,567171
1415675_at     0,777691    0,371057    0,670919    0,410665    0,742277    0,142381   0,540945
1415676_a_at   0,320175    0,358505    0,207274    0,952688    0,615915    0,07167    0,225823
1415677_at     0,840063    0,281845    0,773908    0,396397    0,482995    0,56668    0,19946
1415678_at     0,880974    0,471662    0,906012    0,711181    0,622078    0,575441   0,868816
1415679_at     0,164846    0,957785    0,794479    0,207902    0,091649    0,727786   0,796058
1415680_at     0,56679     0,823206    0,321578    0,513087    0,593739    0,272818   0,620817
1415681_at     0,215698    0,384919    0,691254    0,550108    0,603988    0,110792   0,380126
1415682_at     0,45273     0,36089     0,733234    0,911573    0,549316    0,086473   0,639625
1415683_at     0,526019    0,740045    0,955297    0,797566    0,149079    0,370645   0,57789

Page 21: Swertz Molgenis Bosc2009

1. Model

[Figure: a DATA matrix made of DATA ELEMENTs; its rows and columns are labeled by TRAITs and SUBJECTs.]

Page 22: Swertz Molgenis Bosc2009

1. Model

[Figure: DATA ELEMENT matrix with rows and columns; each row and column is a DIMENSION ELEMENT, either a TRAIT or a SUBJECT.]

SUBJECT annotations:
• Individual
• Strain
• Sample
• …

TRAIT annotations:
• Phenotype
• Probe
• Marker
• Mass Peak
• …

Data:
• Phenotype values
• Raw
• QTLs
• other

Page 23: Swertz Molgenis Bosc2009

Extensions for new experiments

[Figure: DATA ELEMENT matrix with rows and columns; DIMENSION ELEMENT is specialized into TRAIT and SUBJECT variants.]

TRAIT extensions:
PROBE: Name, Gene, Chromosome, Locus
MARKER: Name, Allele, Chromosome, Locus
MASSPEAK: Name, MZ, RetentionTime

SUBJECT extensions:
PANEL: Name, Type (CSS, RIL, ..), Parent Panels
INDIVIDUAL: Name, Strain, Mother, Father, Sex
SAMPLE: Name, Individual, Tissue

And so on…
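The extension mechanism amounts to subclassing: new trait or subject types add their own attributes while generic tools keep working against the base classes. A minimal sketch, assuming Python dataclasses as a stand-in for the generated classes; the attribute names follow the slide, but the default values, the `describe` helper and the example instances (including the chromosome and tissue values) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DimensionElement:
    """Anything that can label a row or column of a data matrix."""
    name: str


@dataclass
class Trait(DimensionElement):
    pass


@dataclass
class Subject(DimensionElement):
    pass


@dataclass
class Marker(Trait):
    allele: str = ""
    chromosome: str = ""
    locus: str = ""


@dataclass
class MassPeak(Trait):
    mz: float = 0.0
    retention_time: float = 0.0


@dataclass
class Sample(Subject):
    individual: str = ""
    tissue: str = ""


def describe(element: DimensionElement) -> str:
    """Generic tool: works for any current or future extension."""
    if isinstance(element, Trait):
        kind = "trait"
    elif isinstance(element, Subject):
        kind = "subject"
    else:
        raise TypeError("expected a Trait or a Subject")
    return f"{type(element).__name__} '{element.name}' ({kind})"


marker = Marker(name="rs13475697", chromosome="1")           # values are illustrative
sample = Sample(name="sample1", individual="ind1", tissue="liver")
print(describe(marker))
print(describe(sample))
```

Adding a new experiment type then means adding one subclass, without changing `describe` or any other tool written against `Trait` and `Subject`.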

Page 24: Swertz Molgenis Bosc2009

Protocol graph from FuGE

› XGAP extends the FuGE standard (Jones et al 2007, Nature Biotech 25, 1127-1133)

[Figure: protocol graph. DATA nodes: Genotype data, Expression data, QTL data. Protocol applications: SNPArray, AffyArray, QTL Mapping. Protocols: IlluminaProtocol, Affy M430 Protocol, MappingProtocol. Software: BeadStudio, Bioconductor Norm., R. Equipment: Illumina, Affy M430 platform. FuGE legend: DATA, protocol application, Protocol, Software, Equipment.]
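A protocol graph of this kind lets a tool trace any data set back to the steps that produced it. The sketch below is an illustration only: the `ProtocolApplication` class is a simplified stand-in for the FuGE concept, and the wiring of inputs and outputs is an assumption read off the slide's labels, not a statement about the actual XGAP records.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class ProtocolApplication:
    name: str
    protocol: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


# Assumed wiring: two array steps feed one QTL mapping step.
APPLICATIONS = [
    ProtocolApplication("SNPArray", "IlluminaProtocol", (), ("Genotype data",)),
    ProtocolApplication("AffyArray", "Affy M430 Protocol", (), ("Expression data",)),
    ProtocolApplication("QTL Mapping", "MappingProtocol",
                        ("Genotype data", "Expression data"), ("QTL data",)),
]


def provenance(data_name: str,
               applications: List[ProtocolApplication]) -> List[str]:
    """Names of all protocol applications upstream of a data set."""
    produced_by: Dict[str, ProtocolApplication] = {
        out: app for app in applications for out in app.outputs
    }
    trail: List[str] = []
    seen: Set[str] = set()
    pending = [data_name]
    while pending:
        app = produced_by.get(pending.pop())
        if app is None or app.name in seen:
            continue                      # raw input, or step already visited
        seen.add(app.name)
        trail.append(app.name)
        pending.extend(app.inputs)        # walk further upstream
    return trail


trail = provenance("QTL data", APPLICATIONS)
print(trail)
```

The first entry of the trail is the step that directly produced the data set; the rest are the steps behind its inputs.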

Page 25: Swertz Molgenis Bosc2009

UML: XGAP extends FuGE

[UML class diagram; classes and attributes as recoverable from the slide]

INVESTIGATION (Core)
Investigation*: ID : autoid, Name : string, Start : DateTime, End : DateTime, Providers : mref
Protocol*: ID : autoid, Name : string, ProtocolText : text
ProtocolApplication*: ID : autoid, Investigation : xref, Name : string, activityDate : DateTime, InputData : mref, OutputData : mref, Protocol : xref
ProtocolElement: Protocol : xref

DATA (Core)
Data*: ID : autoid, Investigation : xref, Name : string
DataElement: ID : autoid, DataSet : xref, Column : xref, Row : xref, Value : object
DimensionElement*: ID : autoid, Investigation : xref, Type : Item.Type, Name : string

TRAIT VARIANTS (Extensions)
Trait
Locus: Species : OntologyTerm, Chromosome : varchar(5), BPstart : decimal, BPend : decimal, cMPosition : decimal, Sequence : text
Marker: Alleles : string, Gene : xref
Gene: Control : bool
ProbeSet: Gene : xref
Probe: MisMatch : bool, ProbeSet : xref, Gene : xref
Spot: X : int, Y : int, GridX : int, GridY : int
MassPeak: RetentionTime : decimal, mz : decimal
Metabolite: Mass : decimal, Formula : string, Structure : string
Protein: Gene : xref, Sequence : text, Mass : decimal
Phenotype: Unit : OntologyTerm

SUBJECT VARIANTS (Extensions)
Subject: Species : OntologyTerm
Strain: Type : [Natural, RI, RCC, CSS, ..], FounderStrains : mref (0..*)
Individual: Strain : xref (0..1), Mother : xref (0..1), Father : xref (0..1), Sex : [male, female, unknown, malefemale]
Sample: Tissue : string, Individual : xref
PairedSample: Individual2 : xref, Label : OntologyTerm, Label2 : OntologyTerm

Uniform core to ease sharing of data and tools
Various traits for new research
Various subjects for new research

Page 26: Swertz Molgenis Bosc2009

2. Model, run MOLGENIS

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE molgenis PUBLIC "MOLGENIS 1.0" "http://molgenis.sourceforge.net/dtd/molgenis_v_1_0.dtd">
    <molgenis name="xgap" label="XGAP - eXtensible Genotype and Phenotype database">
      <!-- INVESTIGATION -->
      <module name="xgap.core">
        <description>Core entities.</description>
        <entity name="Investigation" extends="FugeInvestigation">
          <unique fields="name" description="Name is unique" />
        </entity>
        <entity name="ProtocolApplication" extends="FugeProtocolApplication">
          <field name="Status" type="enum" enum_options="[inprocess, final]" default="inprocess"
            description="The status of this protocolapplication (inprocess = still working on it, final = ready for further analysis)."/>
          <field name="Investigation" type="xref" xref_entity="Investigation" xref_field="id" xref_label="name"
            description="Reference to the Investigation this protocolapplication belongs to."/>
          <unique fields="name,Investigation" description="Name is unique within an Investigation" />
        </entity>
        <!-- DATA -->
        <entity name="Data" extends="FugeData">
          <description>Generic structure for describing data matrices such as
            genotype result, gene expression measurement, QTL calculation, etc.
          </description>
          <field name="Investigation" type="xref" xref_entity="Investigation" xref_field="id" xref_label="name"
            description="Reference to the Investigation this data is measured as part of."/>
          <!--field name="DataType" type="xref" xref_entity="DataType" xref_field="id" xref_label="name"
            description="Added to distinguish between qtl and raw data etc." /-->
          <field name="RowType" type="enum"
            enum_options="[Marker,Probe,ProbeSet,Individual,Sample,PairedSample,MassPeak,Gene,Trait,Subject,Strain,Metabolite,Spot,Phenotype,NMRBin]"
            description="Type of the rows of this matrix. Each row refers to a Trait or Subject (DimensionElement)."/>
          <field name="ColType" type="enum"
            enum_options="[Marker,Probe,ProbeSet,Individual,Sample,PairedSample,MassPeak,Gene,Trait,Subject,Strain,Metabolite,Spot,Phenotype,NMRBin]"
            description="Type of the columns of this matrix. Each column refers to a Trait or Subject (DimensionElement)."/>
          <field name="ValueType" type="enum" enum_options="[Text,Decimal]"
            description="Type of the values of this matrix. E.g. text strings or decimal numbers." />
          <field name="TotalRows" type="int" default="0"/>
          <field name="TotalCols" type="int" default="0"/>
          <unique fields="name,Investigation" />
        </entity>
        <entity name="DimensionElement" extends="FugeDimensionElement">
          <description>
            Describes the biological material or subject which is being 'measured' by a Data set.<br />
            For example a 'Sample' extends from Item, which makes it possible for a microarray-assay
            Data set to reference such a sample (as DataElement can reference any Item).<br />
            A DimensionElement is always linked to a single Investigation.
          </description>
          <field name="Investigation" type="xref_single" xref_entity="Investigation" xref_field=

(model file truncated on the slide)

Page 27: Swertz Molgenis Bosc2009

› Connect to R statistics
› Workflow ready web-services
› UML documentation of your model
› Edit & trace your data
› Import/export to Excel
› Plug in your own scripts (R/QTL)

Tech keywords: object oriented data models, multi-platform Java, Tomcat/Glassfish web server, MySQL/PostgreSQL database, Eclipse/Netbeans IDE, Java API, WSDL/SOAP API, R-project API, MVC, Freemarker templates and CSS for custom layout, open source.

[Screenshot: R session]

    m <- find.markers()
    # 544 markers downloaded.
    …
    library(qtl)
    # qtl analysis here
    add.data(qtl, name = "QTLs")
    # 2,448,000 data elements added.

[Screenshot: file listing] strain.txt, species.txt, protocol.txt, probe.txt, marker.txt, investigation.txt, individual.txt, gene.txt, data.txt, constant.properties, data

Page 28: Swertz Molgenis Bosc2009

Proof of the pudding

Page 29: Swertz Molgenis Bosc2009

Ongoing work

Page 30: Swertz Molgenis Bosc2009

Next step: add processing

Sheets thanks to Joeri van der Velde and Danny Arends

Generalize for all MOLGENIS instances:

(1) Extend MOLGENIS model for tool integration

    <tool name="rqtl">
      <input name="data" entity="data"/>
      …
    </tool>

(2) Integrate workflow definition and execution

Extending on Taverna/Galaxy model & APIs…

Page 31: Swertz Molgenis Bosc2009

Next step: semantics

The Pheno-OM project
› Integrating mouse and man

Generalize for all MOLGENIS instances:

Next: Add MOLGENIS components to integrate:

(1) Ontology browsing: extending on BioPortal/OLS frameworks?

(2) Semantic integration layer ???

Page 32: Swertz Molgenis Bosc2009

Federation? Cloud computing?

› Exploit standard generated interfaces
› Distribute big data and tools
› Meta analysis

Page 33: Swertz Molgenis Bosc2009

Acknowledgements

Joeri van der Velde, Joris Lops, Tomasz Adamusiak, Danny Arends, Martijn Dijkstra, Matthijs Kattenberg, Tjeerd Abma, Ate Boerema, Henrikki Almusa, Rudi Alberts, Damian Smedley, Katy Wolstencroft, Andrew R. Jones, Bruno M. Tesson, Richard A. Scheltema, Gonzalo Vera Rodriguez, Rene Oostergo

Helen E. Parkinson, Ritsert C. Jansen, Cisca Wijmenga, Carole Goble, Marco Roos, M. Scott Marshall, Paul Schofield, John M. Hancock, Juha Muilu, Klaus Schughart, Engbert O. de Brock, Hans Hillege, the LifeLines consortium, the Trial Coordination Center, the GEN2PHEN consortium, the CASIMIR consortium, the NBIC/BioAssist consortium

Page 34: Swertz Molgenis Bosc2009

Questions

› See us at the NBIC booth

› Add generator targets
  - We have funding for several positions (PhD, SE)

› Read more:
  - MOLGENIS for data integration: Smedley et al 2009, Brief. in Bioinformatics 9(6):532
  - Review of MOLGENIS type of systems (dated): Swertz & Jansen 2007, Nature Rev. Genetics 8(3):235
  - First MOLGENIS, in those times in PHP: Swertz et al 2004, Bioinformatics 20(4):2075

http://www.molgenis.org

http://www.xgap.org