Developed at the Broad Institute of MIT and Harvard Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38 no. 5 (2006): pp500-501 GenePattern is supported by funding from the NIH

Developed at the Broad Institute of MIT and Harvard

Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38, no. 5 (2006): 500-501.

GenePattern is supported by funding from the NIH

Today…

• Introduction to GenePattern

– Why

– What

– How

• Demonstration

• Summary

Challenges

• Modern research methods follow a more integrative approach

• Tools are not available to biomedical researchers

• Tools are difficult to use

• Results are difficult to interpret correctly

Purpose

• Create tools that are easily accessible to biomedical researchers

• Allow multiple data sources and methods to be combined

• Enable “reproducible research”

GenePattern

1. Offers a repository of analytic and visualization tools: Modules

2. Easy creation of complex methods from these tools: Pipelines

3. The rapid development and dissemination of new methods: Programming Environment

1. Modules

• Point and click

• ~ 60 analysis modules (handout)

• Documentation

• Designed for Affymetrix data

• 14 different file extensions

2. Pipelines

• Golub et al. illustrates the need

• Records the methods, parameters and data to ensure reproducibility

• Allows methods to be “chained”

• Use published pipelines or create new ones

• Easily shared

• Assigns version numbers
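
The points above (chaining methods, recording parameters, assigning a version) can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Step` and `Pipeline` classes, the step names and the parameter names are hypothetical, and this is not GenePattern's own pipeline format or API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One chained method plus the parameters it runs with (hypothetical)."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """An ordered chain of steps with a version number (hypothetical)."""
    name: str
    version: int
    steps: List[Step] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        # Feed the output of each step into the next one.
        for step in self.steps:
            data = step.func(data, **step.params)
        return data

    def record(self) -> Dict[str, Any]:
        # Plain description of what is run, so it can be shared and rerun.
        return {
            "pipeline": self.name,
            "version": self.version,
            "steps": [{"name": s.name, "params": dict(s.params)} for s in self.steps],
        }


def threshold(values, floor):
    """Raise every value below `floor` up to `floor`."""
    return [max(v, floor) for v in values]


def scale(values, factor):
    """Multiply every value by `factor`."""
    return [v * factor for v in values]


example = Pipeline(
    name="toy_preprocess",
    version=1,
    steps=[
        Step("threshold", threshold, {"floor": 20}),
        Step("scale", scale, {"factor": 2}),
    ],
)

result = example.run([5, 50, 10])
```

Calling `example.record()` returns the step names and parameters as plain data. The design point is that the record, rather than the analyst's memory, is what gets shared.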

3. Programming environment

• Libraries allow transparent access to GenePattern modules from R, Matlab and Java

• Language independent mechanism to add new tools to the module repository

• Tools can be your own or public (e.g. from Bioconductor)
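
One common way to make tool registration language independent is to describe each tool by its command line, so that the tool can be written in anything that runs as a process. The sketch below assumes that approach. The `MODULES` registry, the `<name>` placeholder syntax and the `toy_sum` tool are hypothetical and are not GenePattern's actual module format.

```python
import subprocess
import sys
from typing import Dict, List

# Hypothetical registry: module name -> command-line template.
# Tokens in angle brackets are filled in from the user's parameters.
MODULES: Dict[str, List[str]] = {
    "toy_sum": [
        sys.executable, "-c",
        "import sys; print(sum(float(x) for x in sys.argv[1:]))",
        "<a>", "<b>",
    ],
}


def build_command(module: str, params: Dict[str, str]) -> List[str]:
    """Substitute parameter values into the module's command-line template."""
    command = []
    for token in MODULES[module]:
        if token.startswith("<") and token.endswith(">"):
            command.append(str(params[token[1:-1]]))
        else:
            command.append(token)
    return command


def run_module(module: str, params: Dict[str, str]) -> str:
    """Run the wrapped tool as a separate process and return its text output."""
    completed = subprocess.run(
        build_command(module, params),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.strip()


output = run_module("toy_sum", {"a": "1.5", "b": "2.5"})
```

Here the wrapped tool happens to be a Python one-liner so that the example runs anywhere, but the caller only sees a name and a set of parameters.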

Functional Architecture

Taken from Reich et al Nature Genetics 2006

Components

1. The GenePattern server

2. The Java Client

3. The Web Client

Software Architecture

Reich et al Nature Genetics 2006

GenePattern

• Current version

– Release: 2.0.1, Release date 3/2/2006

• OS compatibility:

– Windows: XP, 2000, 2003

– Mac: OS X 10.3.9 or later

– Unix: Linux, Solaris, Tru64

• Hardware requirements:

– 256MB RAM

– 500MB disk space

Demonstration

http://www.broad.mit.edu/cancer/software/genepattern/

Gene Expression Analysis

• Four broad categories

1. Differential analysis/Marker selection

2. Prediction

3. Class discovery

4. Pathway analysis

• Data Formats

• Annotations
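
As an illustration of the first category, marker selection can be sketched as scoring each gene by how well it separates two classes of samples. The example below uses a simplified signal-to-noise statistic in the style of Golub et al., (mean_a - mean_b) / (sd_a + sd_b). The function names and the expression values are made up for this sketch, and it is not the code of any GenePattern module.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def signal_to_noise(class_a: List[float], class_b: List[float]) -> float:
    """(mean_a - mean_b) / (sd_a + sd_b), using sample standard deviations.

    Raises ZeroDivisionError if both classes have zero spread.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_markers(
    expression: Dict[str, Tuple[List[float], List[float]]]
) -> List[Tuple[str, float]]:
    """Score every gene and sort by absolute score, strongest marker first."""
    scores = [(gene, signal_to_noise(a, b)) for gene, (a, b) in expression.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Made-up expression values for three genes in two classes of samples.
toy_data = {
    "gene_up": ([10.0, 12.0, 11.0], [2.0, 3.0, 1.0]),
    "gene_flat": ([5.0, 6.0, 4.0], [5.0, 4.0, 6.0]),
    "gene_down": ([1.0, 2.0, 3.0], [8.0, 9.0, 10.0]),
}

ranking = rank_markers(toy_data)
```

A positive score marks a gene that is higher in the first class and a negative score one that is higher in the second. A real analysis would also need a significance estimate, for example from permutation testing.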

Proteomics

• SELDI, MALDI and LC-MS in mzXML format

• Quality assessment

• Peak detection

• Spectra comparison

• Proteomic analysis pipeline

• Data conversion

SNP analysis

• In alpha testing

• Uses high-density SNP microarray data

• Copy number alterations

• Loss of heterozygosity (LOH) detection
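
The idea behind LOH detection can be shown with a deliberately naive sketch: a SNP is flagged when the matched normal sample is heterozygous and the tumor sample is homozygous. The genotype encoding ("AA", "AB", "BB") and the function names are assumptions made for this example. Real tools work on noisy array data and use statistical models rather than a per-SNP comparison.

```python
from typing import List


def is_heterozygous(genotype: str) -> bool:
    """True for a two-allele call with different alleles, such as "AB"."""
    return len(genotype) == 2 and genotype[0] != genotype[1]


def is_homozygous(genotype: str) -> bool:
    """True for a two-allele call with identical alleles, such as "AA"."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def loh_calls(normal: List[str], tumor: List[str]) -> List[bool]:
    """Flag SNPs that are heterozygous in normal but homozygous in tumor."""
    if len(normal) != len(tumor):
        raise ValueError("normal and tumor must cover the same SNPs")
    return [is_heterozygous(n) and is_homozygous(t) for n, t in zip(normal, tumor)]


# Made-up genotype calls for four SNPs in a matched normal/tumor pair.
normal_calls = ["AB", "AA", "AB", "BB"]
tumor_calls = ["AA", "AA", "AB", "BB"]
flags = loh_calls(normal_calls, tumor_calls)
```

Only the first SNP is flagged here. SNPs that are already homozygous in the normal sample carry no information about LOH.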

Data preprocessing and conversion

• Importing, exporting and file conversion

• Normalization, filtering and imputing

• ID conversion and annotation

• Row and column extraction, transpose, reorder and split data
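
A few of these operations can be sketched on a small gene-by-sample matrix. The functions below (`clip`, `filter_rows`, `transpose`) and the cut-off values are hypothetical examples in plain Python. They are meant to show what thresholding, filtering and transposing do, not how any GenePattern module implements them.

```python
from typing import List

Matrix = List[List[float]]


def clip(matrix: Matrix, floor: float, ceiling: float) -> Matrix:
    """Force every value into the range [floor, ceiling]."""
    return [[min(max(v, floor), ceiling) for v in row] for row in matrix]


def filter_rows(matrix: Matrix, min_fold: float) -> Matrix:
    """Keep rows whose max/min ratio is at least `min_fold`.

    Assumes all values are positive, for example after clip() with floor > 0.
    """
    return [row for row in matrix if max(row) / min(row) >= min_fold]


def transpose(matrix: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


# Rows are genes, columns are samples (made-up values).
raw = [
    [5.0, 400.0, 20000.0],
    [100.0, 110.0, 105.0],
    [30.0, 90.0, 60.0],
]

clipped = clip(raw, floor=20.0, ceiling=16000.0)
kept = filter_rows(clipped, min_fold=3.0)
by_sample = transpose(kept)
```

The second gene is dropped because it barely varies across samples, and the final matrix has one row per sample.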

Comparison of Selected Microarray Analysis Software Platforms

Reich et al Nature Genetics 2006

Summary

• Has a few minor problems

• Is it something MIBLab can use?

– Who is the user?

– What is it missing? Missing pieces should be easy to add

Sources

Gould J, Getz G, Monti S, Reich M, Mesirov JP. Comparative Gene Marker Selection suite. Bioinformatics. 2006 May 18;

Liefeld T, Reich M, Gould J, Zhang P, Tamayo P, Mesirov JP. GeneCruiser: a web service for the annotation of microarray data. Bioinformatics. 2005 Sep 15;21(18):3681-2.

Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature Genetics 2006 May;38(5):500-1.