Spanish Inquisition

Final Project Week 4 - 5/21/09
Breast Cancer Gene Expression Data

Leon Kay, Yan Tran, Chris Thomas

Cluster Analysis - SAM

• Refined clusters using TMEV’s SAM statistical analysis

• Significance Analysis of Microarrays (SAM)
  – determines whether changes in gene expression are statistically significant
  – identifies statistically significant genes by measuring the strength of the relationship between gene expression and a response variable
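As a rough illustration of the "strength of the relationship" idea, the sketch below computes the SAM relative difference for one gene in the simplest case, a two-class unpaired response (for example, two tumor types). This is only a sketch under assumptions: the function name, the sample values, and the fudge factor `s0` passed in by hand are all hypothetical. SAM picks `s0` from the data and then uses permutations of the class labels to decide which scores are significant, and neither step is shown here.

```python
from math import sqrt


def sam_d_statistic(group_a, group_b, s0=0.0):
    """Relative difference d = (mean_a - mean_b) / (s + s0) for one gene.

    group_a, group_b: expression values of the gene in each class.
    s0: small positive constant (assumed here, not estimated) that keeps
        genes with tiny variance from producing huge scores.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Pooled sum of squared deviations across both classes.
    ss = (sum((x - mean_a) ** 2 for x in group_a)
          + sum((x - mean_b) ** 2 for x in group_b))
    # Gene-specific scatter (standard error of the mean difference).
    s = sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)


# Made-up expression values for a single hypothetical gene.
d = sam_d_statistic([2.0, 2.2, 1.8], [1.0, 1.1, 0.9], s0=0.1)
print(round(d, 3))
```

A gene with a large absolute score relative to what label permutations produce would be called significant. With `s0 = 0` a gene that is perfectly flat in both classes would cause a division by zero, which is one reason a positive `s0` is used.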

MeV SAM Analysis - Results

• Creation of SAM file – used Excel 2007 to manually create the SAM load file.

• SAM splits the 1544 genes into 265 significant genes and 1279 non-significant genes.

• Keeping only the significant genes reduces the gene set to about 17% of the original total (265 / 1544).

MeV SAM Analysis – Significant Genes Graph

MeV SAM Analysis – Non-significant Genes Graph

Kaplan-Meier Survival Analysis

• Used to estimate the probability of survival over time, given a set of lifetime data

• Generated using the XLSTAT Excel plug-in (www.xlstat.com). Thanks, Sri!

• A plot of the Kaplan-Meier estimate of the survival function is a series of declining horizontal steps which, given a large enough sample, approaches the true survival function for that population.
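The step shape described above comes directly from how the estimate is built: at each time where a death is observed, the running survival probability is multiplied by (1 - deaths / patients still at risk). The following is a minimal sketch of that product-limit calculation, not the XLSTAT implementation. The function name and the five follow-up records are made up for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times:  follow-up time for each patient (e.g. months).
    events: True if death was observed at that time, False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)      # patients still under observation
    survival = 1.0           # running product-limit estimate
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0          # deaths plus censored patients at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            # The curve only steps down at times with observed deaths.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data: months and whether death was observed.
months = [5, 8, 8, 12, 20]
died = [True, False, True, True, False]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

Censored patients never lower the curve themselves. They only shrink the at-risk count used at later event times, which is why the steps get larger toward the right-hand side of a plot where few patients remain.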

Survival Analysis – Breast Cancer Type

[Figure: Kaplan-Meier survival distribution function for each breast cancer type, one panel per type. y-axis: survival distribution function (0 to 1); x-axis: overall survival (months).]

Panels (x-axis range in months): Luminal A (0-120), Basal-like (0-80), Claudin-low (0-100), Luminal B (0-100), HER2+/ER- (0-120), Normal Breast-like (0-40)

Survival Analysis - Overall

[Figure, two panels; x-axis: overall survival (months).]

• Survival distribution function (y-axis 0 to 1, x-axis 0 to 120)

• -Log(SDF) (y-axis 0 to 1.2, x-axis 0 to 140)

Relapse Probability

• 30 out of 270 patients relapsed. Only 270 patients in the clinical data have relapse information recorded one way or the other.

• This gives a relapse rate of 0.1111, or 11.11%.

• The 99% confidence interval has a margin of +/- 0.049.

• The estimated probability of relapse, at 99% confidence, is 0.1111 +/- 0.049.

• Or, 11.11% +/- 4.9%, for a range of (6.21%, 16.01%).
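The slides do not say how the interval was calculated. The sketch below assumes the usual normal-approximation (Wald) interval for a proportion, which does reproduce the +/- 0.049 margin. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence):
    """Proportion and margin of error under the normal approximation."""
    p = successes / n
    # Two-sided critical value, about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p, margin


p, margin = wald_interval(30, 270, 0.99)
print(f"{p:.4f} +/- {margin:.3f}")   # 0.1111 +/- 0.049
```

With only 30 relapses, an exact or Wilson interval would give slightly different and asymmetric bounds. The Wald form is used here only because it matches the numbers on the slide.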

Relevance Networks

• The MeV manual states that a “relevance network is a group of genes whose expression profiles are highly predictive of one another.”

• Clusters are drawn as genes connected by lines. A line between two genes shows that the squared correlation coefficient (R²) of their expression profiles falls within preset thresholds.
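To make the idea concrete, here is a small sketch of how such networks could be formed: compute R² for every pair of genes, link the pairs whose R² lies within the thresholds, and report each connected group as one network. It is an illustration under assumptions, not MeV's code. The function names, the threshold defaults, and the four toy profiles are hypothetical, and every profile is assumed to be non-constant.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def relevance_networks(profiles, r2_min=0.9, r2_max=1.0):
    """Group genes whose pairwise R^2 lies in [r2_min, r2_max].

    profiles: dict mapping gene name -> list of expression values.
    Returns a list of networks, each a sorted list of gene names.
    """
    genes = list(profiles)
    parent = {g: g for g in genes}   # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]   # path halving
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            # Clamp to 1.0 to absorb floating-point overshoot.
            r2 = min(pearson_r(profiles[a], profiles[b]) ** 2, 1.0)
            if r2_min <= r2 <= r2_max:
                parent[find(a)] = find(b)   # link the two genes

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    # A gene with no links is not a network.
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Toy profiles for four hypothetical genes.
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],      # strongly anti-correlated with g1
    "g4": [1, 3, 2, 2.5],
}
print(relevance_networks(toy))
```

Because the test is on R² rather than R, strongly anti-correlated genes such as g1 and g3 end up in the same network as positively correlated ones.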

Relevance Networks

The breast cancer data yielded 14 relevance networks.

GATA3

• In week two we mentioned the GATA3 gene.

• It is linked to estrogen receptor alpha.

• It may help with prognosis because its expression profile is very different between the Basal-like and Luminal types.

• Will GATA3 show up as a significant gene after SAM analysis, and will we find the gene associated with estrogen receptor alpha with it?

Relevance Networks

GATA3 and ESR1 are in network 2.

References

• 1) Edward L. Kaplan, “This Week’s Citation Classic”, Current Contents, June 1983. http://www.garfield.library.upenn.edu/classics1983/A1983QS51100001.pdf