Developed at the Broad Institute of MIT and Harvard
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38, no. 5 (2006): pp. 500-501
GenePattern is supported by funding from the NIH
Today…
• Introduction to GenePattern
– Why
– What
– How
• Demonstration
• Summary
Challenges
• Modern research methods follow a more integrative approach
• Tools are not available to biomedical researchers
• Tools are difficult to use
• Results are difficult to interpret correctly
Purpose
• Create tools that are easily accessible to biomedical researchers
• Allow multiple data sources and methods to be combined
• Enable “reproducible research”
GenePattern
1. Offers a repository of analytic and visualization tools: Modules
2. Easy creation of complex methods from these tools: Pipelines
3. The rapid development and dissemination of new methods: Programming Environment
1. Modules
• Point and click
• ~ 60 analysis modules (handout)
• Documentation
• Designed for Affymetrix data
• 14 different file extensions
2. Pipelines
• Golub et al. illustrates the need
• Records the methods, parameters and data to ensure reproducibility
• Allows methods to be “chained”
• Use published pipelines or create new ones
• Easily shared
• Assigns version numbers
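The pipeline idea above (chained methods, recorded parameters, a version number) can be illustrated with a minimal Python sketch. This is not GenePattern's API: the `Step` and `Pipeline` classes and the `threshold` and `scale` functions are hypothetical names invented for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Step:
    """One chained method plus the parameters it is run with (hypothetical)."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Hypothetical pipeline record: a name, a version, and ordered steps."""
    name: str
    version: int = 1
    steps: List[Step] = field(default_factory=list)

    def add(self, name: str, func: Callable[..., Any], **params: Any) -> "Pipeline":
        self.steps.append(Step(name, func, params))
        return self

    def run(self, data: Any) -> Tuple[Any, List[Tuple[str, Dict[str, Any]]]]:
        # Feed each step's output into the next step ("chaining") and
        # log the method name and parameters so the run can be repeated.
        log = []
        for step in self.steps:
            data = step.func(data, **step.params)
            log.append((step.name, dict(step.params)))
        return data, log


# Two toy methods standing in for analysis modules (assumptions).
def threshold(values, floor):
    return [max(v, floor) for v in values]


def scale(values, factor):
    return [v * factor for v in values]


pipeline = Pipeline("preprocess-demo", version=1)
pipeline.add("threshold", threshold, floor=10).add("scale", scale, factor=2)
result, log = pipeline.run([5, 20, 8])
```

Because the log stores every method name with its parameters, rerunning the same pipeline version on the same data gives the same result, which is the point of the reproducibility bullet.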
3. Programming environment
• Libraries allow transparent access to GenePattern modules from R, Matlab and Java
• Language independent mechanism to add new tools to the module repository
• Tools can be your own or public (e.g. from Bioconductor)
Functional Architecture
Taken from Reich et al Nature Genetics 2006
Components
1. The GenePattern server
2. The Java Client
3. The Web Client
Software Architecture
Reich et al Nature Genetics 2006
GenePattern
• Current version
– Release: 2.0.1, Release date 3/2/2006
• OS compatibility:
– Windows: XP, 2000, 2003
– Mac: OS X 10.3.9 or later
– Unix: Linux, Solaris, Tru64
• Hardware requirements:
– 256MB RAM
– 500MB disk space
Demonstration
http://www.broad.mit.edu/cancer/software/genepattern/
Gene Expression Analysis
• Four broad categories
1. Differential analysis/Marker selection
2. Prediction
3. Class discovery
4. Pathway analysis
• Data Formats
• Annotations
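As an illustration of the first category, marker selection, here is a minimal Python sketch that ranks genes with a signal-to-noise score, (mean of class A minus mean of class B) divided by (sd of class A plus sd of class B). It is a sketch under stated assumptions, not the implementation used by any GenePattern module; the function names, gene names, and expression values are made up.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """Signal-to-noise score for one gene: (mean_a - mean_b) / (sd_a + sd_b)."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_markers(expression, labels):
    """Rank genes by absolute signal-to-noise score, strongest first.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of 0/1 class labels, one per sample
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = signal_to_noise(a, b)
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


# Toy data (invented): gene_1 separates the classes, gene_2 does not.
expression = {
    "gene_1": [10.0, 12.0, 11.0, 30.0, 32.0, 31.0],
    "gene_2": [10.0, 30.0, 20.0, 12.0, 28.0, 20.0],
}
labels = [0, 0, 0, 1, 1, 1]
ranking = rank_markers(expression, labels)
```

A real analysis would also assess significance, for example by permuting the class labels, which this sketch leaves out.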
Proteomics
• SELDI, MALDI and LC-MS in mzXML format
• Quality assessment
• Peak detection
• Spectra comparison
• Proteomic analysis pipeline
• Data conversion
SNP analysis
• In alpha testing
• Uses high-density SNP microarray data
• Copy number alterations
• Loss of heterozygosity (LOH) detection
Data preprocessing and conversion
• Importing, exporting and file conversion
• Normalization, filtering and imputing
• ID conversion and annotation
• Row and column extraction, transpose, reorder and split data
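Several of the operations listed above (imputing, filtering, transpose) are simple to sketch in plain Python on a genes-by-samples matrix. The helper names and the toy matrix below are assumptions for illustration and do not correspond to actual GenePattern modules or their parameters.

```python
def impute_row_mean(row):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in row if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in row]


def filter_by_range(matrix, row_names, min_range):
    """Keep only rows whose (max - min) is at least min_range."""
    kept_names, kept_rows = [], []
    for name, row in zip(row_names, matrix):
        if max(row) - min(row) >= min_range:
            kept_names.append(name)
            kept_rows.append(row)
    return kept_names, kept_rows


def transpose(matrix):
    """Swap rows and columns of a rectangular matrix."""
    return [list(col) for col in zip(*matrix)]


# Toy genes-by-samples matrix (invented values); None marks a missing value.
row_names = ["g1", "g2", "g3"]
matrix = [
    [1.0, None, 3.0],
    [5.0, 5.0, 5.0],
    [2.0, 8.0, 4.0],
]

imputed = [impute_row_mean(row) for row in matrix]
names, kept = filter_by_range(imputed, row_names, min_range=2.0)
samples_by_genes = transpose(kept)
```

Imputing before filtering matters here: `max` and `min` cannot compare `None` with numbers, so the variation filter needs complete rows.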
Comparison of Selected Microarray Analysis Software Platforms
Reich et al Nature Genetics 2006
Summary
• Has a few minor problems
• Is it something MIBLab can use?
– Who is the user?
– What is it missing? Missing pieces should be easy to add
Sources
Gould J, Getz G, Monti S, Reich M, Mesirov JP. Comparative Gene Marker Selection suite. Bioinformatics. 2006 May 18;
Liefeld T, Reich M, Gould J, Zhang P, Tamayo P, Mesirov JP. GeneCruiser: a web service for the annotation of microarray data. Bioinformatics. 2005 Sep 15;21(18):3681-2.
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature Genetics. 2006 May;38(5):500-1.