1 Update on ArrayExpress & standards Ugis Sarkans, EBI


Update on ArrayExpress & standards

Ugis Sarkans, EBI


MAGE-OM

• MAGE-OM: MicroArray Gene Expression Object Model
  – in January 2002 became an “adopted” OMG specification
  – January to August 2002 - finalization process
  – in September became an “available” specification
  – should be set in stone for the next 2 years
  – thinking about MAGE v2 started
    • user feedback
    • support for other types of functional genomics data
    • more precise handling of data manipulation

UML Packages of MAGE

[Diagram. Packages shown: BioEvent, Experiment, ArrayDesign, BioMaterial, BioAssayData, BioAssay, DesignElement, HigherLevelAnalysis, BioSequence, Array, QuantitationType, Description, Protocol, Measurement, AuditAndSecurity, BQS. Group labels in the figure: what was used, what was done, results, miscellaneous.]


MAGE*

• MAGE-ML: MicroArray Gene Expression Markup Language
  – generated from MAGE-OM, therefore evolved automatically
  – translation from Jan 2002 to Sep 2002 DTD quite easy
• MAGEstk: MicroArray Gene Expression Software ToolKit
  – Jamboree IV in Stanford, beginning of December
  – used in MIAMExpress (MAGE-ML export)


MAGEstk

• Programming APIs
• Mapping of MAGE-OM to language-specific OMs
• APIs are automatically generated from the OM specifications
  – get/set methods for associations
  – get/set methods for attributes
• XML <=> language-specific OM marshallers/unmarshallers - also automatically generated
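
As a rough illustration of the generated-API idea above, the following Python sketch derives get/set methods from a small object-model description and marshals objects to and from XML. The `SPEC` dictionary, the class and field names, and the XML layout are all invented for this example; this is not the MAGEstk API and not real MAGE-ML syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified object-model spec (NOT the real MAGE-OM).
SPEC = {
    "ArrayDesign": {"attributes": ["identifier", "name"],
                    "associations": ["features"]},
    "Feature": {"attributes": ["identifier"], "associations": []},
}


def _accessors(field):
    """Build one getter/setter pair for a field name."""
    def getter(self):
        return self._values[field]

    def setter(self, value):
        self._values[field] = value

    return getter, setter


def generate_classes(spec):
    """Create one class per spec entry, with getX/setX for every member."""
    classes = {}
    for class_name, members in spec.items():
        def init(self, _members=members):
            # Attributes start empty; associations start as empty lists.
            self._values = {a: None for a in _members["attributes"]}
            self._values.update({a: [] for a in _members["associations"]})

        namespace = {"__init__": init, "_spec": members}
        for field in members["attributes"] + members["associations"]:
            getter, setter = _accessors(field)
            suffix = field[0].upper() + field[1:]
            namespace["get" + suffix] = getter
            namespace["set" + suffix] = setter
        classes[class_name] = type(class_name, (object,), namespace)
    return classes


def marshal(obj):
    """Object -> XML: attributes become XML attributes, associations nest."""
    element = ET.Element(type(obj).__name__)
    for attr in obj._spec["attributes"]:
        value = obj._values[attr]
        if value is not None:
            element.set(attr, str(value))
    for assoc in obj._spec["associations"]:
        children = obj._values[assoc]
        if children:
            wrapper = ET.SubElement(element, assoc)
            for child in children:
                wrapper.append(marshal(child))
    return element


def unmarshal(element, classes):
    """XML -> object: the inverse of marshal(), driven by the same spec."""
    obj = classes[element.tag]()
    for attr in obj._spec["attributes"]:
        if attr in element.attrib:
            obj._values[attr] = element.get(attr)
    for assoc in obj._spec["associations"]:
        wrapper = element.find(assoc)
        if wrapper is not None:
            obj._values[assoc] = [unmarshal(c, classes) for c in wrapper]
    return obj
```

Because both the accessors and the marshalling code are driven by the same spec, a change to the spec propagates to the API and to the XML mapping without hand-editing either, which is the property the slide attributes to automatic generation.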


MAGEstk (cont.)

• Use open-source/standard modules/packages
  – Xerces, JDBC, etc.
• Implementation in Java, C++, Perl, Python
• database access modules on top of these APIs
  – Postgres schema
  – DB access layer
• annotation tools - planned


ArrayExpress: data

• Currently - 9 experiments, 4 array designs:
  – from EMBL - human, yeast
  – from Sanger - pombe
• Coming:
  – array descriptions: Affymetrix, Agilent
  – labs: TIGR, Utrecht, more from Sanger, ...
  – export from existing DBs: SMD, RAD
  – tools - MAGE-ML export: BASE, maxd, ...
  – ILSI project

ArrayExpress Infrastructure

[Diagram. Labelled components: ArrayExpress (Oracle + Tomcat); MIAMExpress (MySQL); local MIAMExpress installations; Expression Profiler; EBI; other microarray databases; external bioinformatics databases; array manufacturers; LIMS; microarray software; data analysis software; data pipelines. Labelled flows: submissions; queries; data analysis; MAGE-ML; MAGE-ML import, export; www.]

ArrayExpress architecture

[Diagram. Labelled components: MAGE-ML (DTD); MAGE-OM; MAGE-ML documents; MAGEvalidator; error.log; MAGEloader; MAGEunloader; Castor (object/relational mapping); ArrayExpress (Oracle); Tomcat; Java servlets; Velocity template engine; Web page templates.]

Queries

[Diagram. Entities shown, with the fields labelled on them:]
• Array Design - accession, name
• Protocol - accession
• Experiment - accession
• Organisation - name
• Person - last name
• Array
• Species
• Sample
• Hybridisation
• ExperimentDesign
• ExperimentType
• ExperimentalFactor
• Protocol Type


Experiment plan display


ArrayExpress: other technical details

• Data matrices - stored in NetCDF format:
  – binary format for efficient storage of multidimensional arrays
• Arrays - stored as ADF spreadsheets (in addition to normal MAGE structures)
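
To make the "binary format for multidimensional arrays" point concrete, here is a minimal standard-library sketch. It is not NetCDF and does not reflect how ArrayExpress lays out its files; it only shows the general idea such formats rely on: a small header with the dimensions, followed by fixed-width values, so that one cell can be read by computing its offset instead of parsing the whole matrix. The function names and the byte layout are assumptions made for this example.

```python
import struct

HEADER = "<II"                      # two little-endian uint32: rows, columns
HEADER_SIZE = struct.calcsize(HEADER)
CELL_SIZE = struct.calcsize("<d")   # each cell is one float64


def pack_matrix(rows):
    """Pack a rectangular list of float rows into header + row-major values."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    if any(len(r) != n_cols for r in rows):
        raise ValueError("matrix must be rectangular")
    flat = [value for row in rows for value in row]
    return struct.pack(HEADER, n_rows, n_cols) + struct.pack(
        f"<{len(flat)}d", *flat
    )


def read_cell(blob, row, col):
    """Read a single cell by offset, without unpacking the other cells."""
    n_rows, n_cols = struct.unpack_from(HEADER, blob, 0)
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise IndexError("cell outside matrix")
    offset = HEADER_SIZE + (row * n_cols + col) * CELL_SIZE
    return struct.unpack_from("<d", blob, offset)[0]
```

A real NetCDF file adds named dimensions, variables and metadata on top of this kind of layout; in practice one would use a NetCDF library and not a hand-rolled format.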


Data warehouse - for gene- and data-driven queries

[Diagram. Five boxes labelled "Properties". Labelled items: ratio absolute change; confidence measure; name; design element type; species; sample type; bioassay type; performer lab; exper. type; array design name; platform type; provider.]
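
The distinction between gene-driven and data-driven queries can be sketched with a toy in-memory table. The schema, the function names and all rows below are made up for illustration and say nothing about the actual warehouse design; only the property names (name, design element type, species, sample type, array design name, ratio, confidence measure) are borrowed from the diagram.

```python
import sqlite3


def build_demo_warehouse():
    """Create a toy, denormalised table of expression values (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE expression_value (
               gene_name TEXT, design_element_type TEXT,
               species TEXT, sample_type TEXT,
               array_design_name TEXT,
               ratio REAL, confidence REAL)"""
    )
    rows = [
        ("geneA", "reporter", "yeast", "heat shock", "demo-array", 2.5, 0.9),
        ("geneA", "reporter", "yeast", "control", "demo-array", 1.0, 0.8),
        ("geneB", "reporter", "yeast", "heat shock", "demo-array", 0.4, 0.95),
        ("geneB", "reporter", "yeast", "control", "demo-array", 3.1, 0.3),
    ]
    conn.executemany(
        "INSERT INTO expression_value VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


def gene_driven(conn, gene_name):
    """Gene-driven: start from a gene, retrieve its measurements."""
    cursor = conn.execute(
        "SELECT sample_type, ratio FROM expression_value "
        "WHERE gene_name = ? ORDER BY sample_type",
        (gene_name,),
    )
    return cursor.fetchall()


def data_driven(conn, min_ratio, min_confidence):
    """Data-driven: start from a condition on the values, retrieve genes."""
    cursor = conn.execute(
        "SELECT DISTINCT gene_name FROM expression_value "
        "WHERE ratio >= ? AND confidence >= ? ORDER BY gene_name",
        (min_ratio, min_confidence),
    )
    return [row[0] for row in cursor]
```

Both queries run over the same rows; they differ only in which column is the starting point, which is why a single denormalised store can serve both.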


In development

• Immediate:
  – interface efficiency improvements
  – BioAssays - graphical display
  – better integration with Expression Profiler
• Medium-term:
  – user management
    • non-public data (e.g., for reviewers)
  – MAGE-ML export
• Curation tool

Microarray Informatics team at EBI

Alvis Brazma - group leader

ArrayExpress Curation MIAMExpress

• Ugis Sarkans
• Gonzalo Garcia
• Helen Parkinson
• Mohammadreza Shojatalab

Expression Profiler

• Jaak Vilo

Research, students

• Thomas Schlitt
• Katja Kivinen
• Johan Rung
• Patrick Kemmeren
• Misha Kapushesky
• Lev Soinov
• Koichi Tazaki
• Anastasia Samsonova
• Susanna Sansone
• Philippe Rocca-Serra
• Ele Holloway
• Niran Abeygunawardena
• Ahmet Oezcimen
• Gaurab Mukherjee
• Sergio Contrino
• Anjan Sharma
• Aurora Torrente