1
Update on ArrayExpress & standards
Ugis Sarkans, EBI
2
MAGE-OM
• MAGE-OM: MicroArray Gene Expression Object Model
  – in January 2002 became an “adopted” OMG specification
  – January to August 2002 - finalization process
  – in September became an “available” specification
  – should be set in stone for the next 2 years
  – thinking about MAGE v2 started
    • user feedback
    • support for other types of functional genomics data
    • more precise handling of data manipulation
3
UML Packages of MAGE
[Diagram: MAGE-OM packages, grouped under the labels "what was used", "what was done", "results" and "miscellaneous"]
Packages shown: BioEvent, Experiment, ArrayDesign, BioMaterial, BioAssayData, BioAssay, DesignElement, HigherLevelAnalysis, BioSequence, Array, QuantitationType, Description, Protocol, Measurement, AuditAndSecurity, BQS
4
MAGE*
• MAGE-ML: MicroArray Gene Expression Markup Language
  – generated from MAGE-OM, therefore evolved automatically
  – translation from Jan 2002 to Sep 2002 DTD quite easy
• MAGEstk: MicroArray Gene Expression Software ToolKit
  – Jamboree IV in Stanford, beginning of December
  – used in MIAMExpress (MAGE-ML export)
5
MAGEstk
• Programming APIs
• Mapping of MAGE-OM to language-specific OMs
• APIs are automatically generated from the OM specifications
  – get/set methods for associations
  – get/set methods for attributes
• XML <=> language-specific OM marshallers/unmarshallers - also automatically generated
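The kind of API described on this slide can be pictured with a small hand-written sketch. The class names `Experiment` and `BioAssay` echo MAGE-OM package names, but the method names, the `name` attribute and the XML layout below are assumptions made for illustration only; this is not the generated MAGEstk code and it does not follow the MAGE-ML DTD.

```python
import xml.etree.ElementTree as ET


class BioAssay:
    """Hypothetical object-model class with one attribute (not the MAGEstk class)."""

    def __init__(self, identifier):
        self._identifier = identifier
        self._name = None

    def get_identifier(self):
        return self._identifier

    # get/set methods for an attribute
    def get_name(self):
        return self._name

    def set_name(self, name):
        self._name = name


class Experiment:
    """Hypothetical class holding an association to BioAssay objects."""

    def __init__(self, identifier):
        self._identifier = identifier
        self._bioassays = []

    def get_identifier(self):
        return self._identifier

    # get/set methods for an association
    def get_bioassays(self):
        return list(self._bioassays)

    def set_bioassays(self, bioassays):
        self._bioassays = list(bioassays)


def marshal(experiment):
    """Object model -> XML text (element and attribute names are assumed)."""
    root = ET.Element("Experiment", identifier=experiment.get_identifier())
    for assay in experiment.get_bioassays():
        ET.SubElement(
            root,
            "BioAssay",
            identifier=assay.get_identifier(),
            name=assay.get_name() or "",
        )
    return ET.tostring(root, encoding="unicode")


def unmarshal(xml_text):
    """XML text -> object model, the inverse of marshal()."""
    root = ET.fromstring(xml_text)
    experiment = Experiment(root.get("identifier"))
    assays = []
    for node in root.findall("BioAssay"):
        assay = BioAssay(node.get("identifier"))
        assay.set_name(node.get("name"))
        assays.append(assay)
    experiment.set_bioassays(assays)
    return experiment
```

In the toolkit these classes and the marshalling code are generated from the object model rather than written by hand; the sketch only shows the shape of the result.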
6
MAGEstk (cont.)
• Use open-source/standard modules/packages
  – Xerces, JDBC, etc.
• Implementation in Java, C++, Perl, Python
• Database access modules on top of these APIs
  – Postgres schema
  – DB access layer
• Annotation tools - planned
7
ArrayExpress: data
• Currently - 9 experiments, 4 array designs:
  – from EMBL - human, yeast
  – from Sanger - pombe
• Coming:
  – array descriptions: Affymetrix, Agilent
  – labs: TIGR, Utrecht, more from Sanger, ...
  – export from existing DBs: SMD, RAD
  – tools - MAGE-ML export: BASE, maxd, ...
  – ILSI project
8
ArrayExpress Infrastructure
[Diagram: data flow into and out of ArrayExpress]
• Core: ArrayExpress (Oracle + Tomcat), with MAGE-ML import and export
• Submissions: MIAMExpress (MySQL) over the www; local MIAMExpress installations; data pipelines delivering MAGE-ML
• Data sources: array manufacturers, LIMS, microarray software, data analysis software
• Access: queries over the www; data analysis over the www with EBI Expression Profiler
• Links: other microarray databases, external bioinformatics databases
9
ArrayExpress architecture
[Diagram: components of the ArrayExpress implementation]
• MAGE-OM and MAGE-ML (DTD)
• MAGE-ML documents
• MAGEvalidator, MAGEloader, MAGEunloader, error.log
• Castor - object/relational mapping
• ArrayExpress (Oracle)
• Tomcat: Java servlets, Velocity template engine, web page templates
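The slide names a validator, a loader and an error log but the extracted text does not show how they are connected. The sketch below assumes the simplest reading: each MAGE-ML document is checked first, failures are written to a log, and only documents that pass are loaded. The function names, the in-memory `store`, and the use of `MAGE-ML` as the expected root element are assumptions; the check is a well-formedness and root-element test only, not validation against the DTD.

```python
import xml.etree.ElementTree as ET


def validate(xml_text, expected_root="MAGE-ML"):
    """Return a list of error strings; an empty list means the document passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    if root.tag != expected_root:
        return [f"unexpected root element: {root.tag}"]
    return []


def load_documents(documents, store, error_log):
    """Validate each (name, xml_text) pair; load the good ones, log the rest.

    `store` stands in for the database and `error_log` for error.log;
    both are plain Python containers in this sketch.
    """
    for name, xml_text in documents:
        errors = validate(xml_text)
        if errors:
            error_log.extend(f"{name}: {error}" for error in errors)
            continue
        store[name] = ET.fromstring(xml_text)
    return store
```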
10
Queries
[Diagram: entities and the fields that can be queried]
• Array Design - accession, name
• Protocol - accession
• Experiment - accession
• Organisation - name
• Person - last name
• Other entities shown: Array, Species, Sample, Hybridisation, ExperimentDesign, ExperimentType, ExperimentalFactor, Protocol Type
11
12
13
Experiment plan display
14
ArrayExpress: other technical details
• Data matrices - stored in NetCDF format:
  – binary format for efficient storage of multidimensional arrays
• Arrays - stored as ADF spreadsheets (in addition to normal MAGE structures)
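Why a binary array format suits data matrices can be shown with a toy example. The sketch below is not NetCDF and does not use a NetCDF library; it is a made-up layout (a two-integer header followed by row-major float64 cells) using only the Python standard library, to illustrate how a fixed binary layout lets a single cell be read without parsing the whole matrix.

```python
import io
import struct

HEADER = struct.Struct("<II")  # rows, columns as little-endian uint32
CELL = struct.Struct("<d")     # one little-endian float64 per cell


def write_matrix(stream, rows):
    """Write a rectangular list of lists as header + row-major cells."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    stream.write(HEADER.pack(n_rows, n_cols))
    for row in rows:
        if len(row) != n_cols:
            raise ValueError("all rows must have the same length")
        stream.write(struct.pack(f"<{n_cols}d", *row))


def read_cell(stream, row, col):
    """Read one cell by seeking straight to its byte offset."""
    stream.seek(0)
    n_rows, n_cols = HEADER.unpack(stream.read(HEADER.size))
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise IndexError("cell outside matrix")
    stream.seek(HEADER.size + (row * n_cols + col) * CELL.size)
    return CELL.unpack(stream.read(CELL.size))[0]
```

A real NetCDF file adds named dimensions, variables and metadata on top of the same basic idea.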
15
16
17
Data warehouse - for gene- and data-driven queries
[Diagram: measured values linked to groups of properties]
• Values: ratio, absolute change, confidence measure
• Properties: name, design element type
• Properties: species, sample type
• Properties: bioassay type
• Properties: performer lab, exper. type
• Properties: array design name, platform type, provider
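One way to read this slide is as a table of measured values surrounded by property tables. The sketch below assumes that reading and uses an in-memory SQLite database; the table and column names, the reduction to two property groups, and the interpretation of "gene-driven" (start from a design element) and "data-driven" (start from the values) are all assumptions, not the ArrayExpress schema.

```python
import sqlite3


def build_warehouse():
    """Create a tiny warehouse: one table of values, two property tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE design_element (
            id INTEGER PRIMARY KEY, name TEXT, element_type TEXT);
        CREATE TABLE sample (
            id INTEGER PRIMARY KEY, species TEXT, sample_type TEXT);
        CREATE TABLE measurement (
            design_element_id INTEGER REFERENCES design_element(id),
            sample_id INTEGER REFERENCES sample(id),
            ratio REAL,
            absolute_change REAL,
            confidence REAL);
        """
    )
    return conn


def gene_driven_query(conn, element_name):
    """Start from a design element name and return its measurements."""
    return conn.execute(
        "SELECT s.species, m.ratio, m.confidence "
        "FROM measurement m "
        "JOIN design_element d ON d.id = m.design_element_id "
        "JOIN sample s ON s.id = m.sample_id "
        "WHERE d.name = ? ORDER BY s.species",
        (element_name,),
    ).fetchall()


def data_driven_query(conn, min_ratio, min_confidence):
    """Start from the values and return the design elements that match."""
    rows = conn.execute(
        "SELECT DISTINCT d.name "
        "FROM measurement m "
        "JOIN design_element d ON d.id = m.design_element_id "
        "WHERE m.ratio >= ? AND m.confidence >= ? ORDER BY d.name",
        (min_ratio, min_confidence),
    )
    return [row[0] for row in rows]
```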
18
In development
• Immediate:
  – interface efficiency improvements
  – BioAssays - graphical display
  – better integration with Expression Profiler
• Medium-term:
  – user management
    • non-public data (e.g., for reviewers)
  – MAGE-ML export
• Curation tool
19
Microarray Informatics team at EBI
Alvis Brazma - group leader
Groups: ArrayExpress, Curation, MIAMExpress, Expression Profiler, Research, students
• Ugis Sarkans
• Gonzalo Garcia
• Helen Parkinson
• Mohammadreza Shojatalab
• Jaak Vilo
• Thomas Schlitt
• Katja Kivinen
• Johan Rung
• Patrick Kemmeren
• Misha Kapushesky
• Lev Soinov
• Koichi Tazaki
• Anastasia Samsonova
• Susanna Sansone
• Philippe Rocca-Serra
• Ele Holloway
• Niran Abeygunawardena
• Ahmet Oezcimen
• Gaurab Mukherjee
• Sergio Contrino
• Anjan Sharma
• Aurora Torrente