Patricia Hernandez
Geneva, 28th September 2006
Swiss Bio Grid: Proteomics Project (PP)
Context: Proteomics
Definition:
A set of technologies and methodologies for large-scale studies of proteins: identification, characterization, quantification
Typical proteomic study:
identify proteins that are differentially expressed between two samples (e.g. normal vs disease state)
Technology:
tandem mass spectrometry (MSMS) = mass measurement of protein fragments
Identification of proteins: principle
Many tools are available; all work in the same way:
- a LIST OF MSMS SPECTRA (tens to thousands), processed sequentially
- a LIST OF POSSIBLE SOLUTIONS, e.g. a list of known protein sequences (thousands to millions)
- solutions are (sequentially) evaluated against the spectra using a COMPARISON FUNCTION
- some display (OUTPUT) of the identified proteins (with/without additional features such as statistics, result export, etc.)
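The principle above can be sketched in a few lines. This is only an illustration under stated assumptions: the shared-peak count used as comparison function, the 0.5 mass tolerance, and all names (`shared_peak_count`, `identify`, `PROT_A`, `PROT_B`) are hypothetical and are not taken from Phenyx, X!Tandem, Popitam or InsPecT.

```python
from typing import Callable, List, Optional, Tuple

Spectrum = List[float]                # measured fragment masses of one MSMS spectrum
Candidate = Tuple[str, List[float]]   # (protein id, theoretical fragment masses)


def shared_peak_count(spectrum: Spectrum, theoretical: List[float],
                      tolerance: float = 0.5) -> int:
    """Toy comparison function (assumption): number of measured masses
    that lie within `tolerance` of some theoretical mass."""
    return sum(
        1 for mass in spectrum
        if any(abs(mass - t) <= tolerance for t in theoretical)
    )


def identify(spectra: List[Spectrum], candidates: List[Candidate],
             compare: Callable[[Spectrum, List[float]], int] = shared_peak_count
             ) -> List[Tuple[Optional[str], int]]:
    """Return the best-scoring candidate (or None) for each spectrum."""
    results = []
    for spectrum in spectra:                        # spectra processed sequentially
        best_id, best_score = None, 0
        for protein_id, theoretical in candidates:  # every solution is evaluated
            score = compare(spectrum, theoretical)
            if score > best_score:
                best_id, best_score = protein_id, score
        results.append((best_id, best_score))
    return results


# Tiny made-up input: two spectra, two candidate proteins.
spectra = [[100.1, 200.2, 300.3], [150.0, 250.0]]
candidates = [
    ("PROT_A", [100.0, 200.0, 300.0]),
    ("PROT_B", [150.2, 250.1, 400.0]),
]

# OUTPUT step: a plain listing of the identified proteins.
for index, (protein_id, score) in enumerate(identify(spectra, candidates)):
    print(index, protein_id, score)
```

Real tools differ mainly in the comparison function and in the statistics attached to the output, which is why running several of them on the same data can be worthwhile.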
Key idea and main objectives of the PP
Key idea:
Give access, through a single web portal, to several spectrum analysis programs in a workflow-oriented data analysis platform:
the swissPIT platform
Main objectives:
- increase the coverage of identified proteins
- automate analysis workflows
- provide an environment for parameter optimisation studies and for benchmarking
swissPIT overview: three distinct parts
Interaction with the user:
- MSMS data upload
- choice of workflows and parameter configuration
- result visualisation
- data/result sharing
Execution of the analysis workflow selected by the user:
- data exploitation or high-throughput centered workflows
- task-specific workflows (= personalized for a given lab)
Easy parallelisation:
- In a workflow, several analysis tools may be called at the same time (and independently)
- For a given identification tool, the spectrum list and/or the database can be split into bundles, and each bundle analysed independently
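The bundle idea can be illustrated as follows. This is a minimal sketch, not swissPIT code: `split_into_bundles`, `analyse_bundle` and `analyse_in_parallel` are hypothetical names, the analysis step is a placeholder, and a local thread pool stands in for the grid or cluster that would run the bundles in practice.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_bundles(items: List[str], bundle_size: int) -> List[List[str]]:
    """Cut a list (spectra or database entries) into consecutive bundles."""
    return [items[i:i + bundle_size] for i in range(0, len(items), bundle_size)]


def analyse_bundle(bundle: List[str]) -> List[str]:
    """Placeholder (assumption) for one run of an identification tool
    on a single bundle; here it only labels each spectrum id."""
    return [f"result for {spectrum_id}" for spectrum_id in bundle]


def analyse_in_parallel(spectrum_ids: List[str], bundle_size: int,
                        max_workers: int = 4) -> List[str]:
    """Analyse every bundle independently, then merge the partial results."""
    bundles = split_into_bundles(spectrum_ids, bundle_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the bundle order, so merged results follow input order
        per_bundle = list(pool.map(analyse_bundle, bundles))
    return [result for partial in per_bundle for result in partial]


print(analyse_in_parallel(["s1", "s2", "s3", "s4", "s5"], bundle_size=2))
```

Because no bundle depends on another, the same split-and-merge pattern applies whether the workers are threads, cluster nodes, or grid sites.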
Use of distributed resources:
- Each site decides which databases and tools to install and maintain.
- This corresponds to reality: research groups and proteomics facilities are geographically scattered and need to collaborate.
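One consequence of letting each site choose its own installation is that a job can only run where both the requested tool and the requested database are present. The sketch below illustrates that matching step under assumptions: the site names, their contents, and the `sites_for` helper are invented for the example and do not describe the actual swissBioGrid sites.

```python
from typing import Dict, List, Set

# Hypothetical registry (assumption): what each site has installed.
SITES: Dict[str, Dict[str, Set[str]]] = {
    "site_a": {
        "tools": {"Phenyx", "X!Tandem"},
        "databases": {"uniProtKB/swissProt"},
    },
    "site_b": {
        "tools": {"Popitam", "InsPecT", "X!Tandem"},
        "databases": {"uniProtKB/swissProt", "uniProtKB/trEMBL"},
    },
}


def sites_for(tool: str, database: str,
              sites: Dict[str, Dict[str, Set[str]]] = SITES) -> List[str]:
    """Return the sites (sorted by name) that offer both the tool and the database."""
    return sorted(
        name for name, installed in sites.items()
        if tool in installed["tools"] and database in installed["databases"]
    )


print(sites_for("X!Tandem", "uniProtKB/swissProt"))
print(sites_for("Phenyx", "uniProtKB/trEMBL"))
```

An empty result means that no single site can serve the request as configured.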
Current status of swissPIT
- Web-based interface
- 4 protein identification tools: Phenyx, X!Tandem, Popitam, InsPecT
- 2 protein sequence databases: uniProtKB/swissProt (> 230’000 entries), uniProtKB/trEMBL (> 3’180’000 entries)
- swissBioGrid compatible (submission to a grid is transparent for the user)
Current status: swissPIT from inside
[Diagram: a user layer and a system layer; labels: submit, mgf, Grid/cluster]
Current status: swissPIT from outside
http://swisspit.cscs.ch/
Login: username, password
- Some global parameters
- List of installed software: check/uncheck boxes to select the software to be run on the data
- Click on a link to display and configure software-specific parameters
- Press to run the software
- Go to the user space to browse old/new projects
Results are visualized in native format:
- raw text for Popitam and InsPecT
- XML with style sheet for X!Tandem
- advanced Java interface for Phenyx
Upcoming work
- Code improvement: improve readability and maintainability of code
- Standardisation: unify parameters as much as possible; display results in one format
- Workflows: find a way to implement workflows using XML configuration files
- screen for unsuspected modifications
- screen for proteins
- remove spectra with low peak statistics
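Since workflows driven by XML configuration files are still planned work, the following is only one possible shape for such a file, not the swissPIT format. The element and attribute names (`workflow`, `step`, `param`, `tool`, `database`) and the parameter `fragment_tolerance` are assumptions made for illustration; the parsing uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow description (assumption): two identification steps.
WORKFLOW_XML = """
<workflow name="example">
  <step tool="X!Tandem" database="uniProtKB/swissProt">
    <param name="fragment_tolerance" value="0.5"/>
  </step>
  <step tool="Popitam" database="uniProtKB/swissProt"/>
</workflow>
"""


def parse_workflow(xml_text):
    """Return (workflow name, list of steps); each step is a dict
    holding the tool, the database and a dict of its parameters."""
    root = ET.fromstring(xml_text.strip())
    steps = []
    for step in root.findall("step"):
        params = {p.get("name"): p.get("value") for p in step.findall("param")}
        steps.append({
            "tool": step.get("tool"),
            "database": step.get("database"),
            "params": params,
        })
    return root.get("name"), steps


name, steps = parse_workflow(WORKFLOW_XML)
print(name)
for step in steps:
    print(step["tool"], step["database"], step["params"])
```

Keeping the workflow in a declarative file like this would let a lab define a task-specific workflow without changing the platform code.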
Involved people, acknowledgments
Ron Appel
Patricia Hernandez
Celine Hernandez
Andreas Quandt
Marc Tuloup
Pierre-Alain Binz
Markus Müller
Alexandre Masselot, Nicolas Budin
Vital-it team + Bruno Nyffeler
Peter Kunszt, Sergio Maffioletti, Arthur Thomas
Thank you for your attention