
Source: compdiag.molgen.mpg.de/docs/ECCB03_lottaz_poster.pdf


Probe-sets are annotated to GO-nodes. For each annotated GO-node, one classifier is implemented using the nearest shrunken centroid method [Tibshirani et al., 2002]. Each node obtains a weight according to its deviance. Results of children are collected in their parents through weighted sums. Shrinkage removes uninformative nodes.

Each node computes classification probabilities based only on information about genes involved in the corresponding biological aspect.
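The scheme above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the StAM implementation: it uses plain nearest centroids with a softmax over negative squared distances in place of the shrunken-centroid classifier, takes node weights as given rather than deriving them from the deviance, and models shrinkage as soft-thresholding of those weights. All function names, node names and the toy data are hypothetical.

```python
import math

def centroid(rows):
    """Mean expression vector of a list of samples (each a list of floats)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def leaf_probability(sample, centroids):
    """Class probabilities from squared distances to each class centroid.

    Assumption: plain nearest centroid plus softmax; the shrinking of
    centroids described by Tibshirani et al. is omitted here.
    """
    scores = {c: -sum((x - m) ** 2 for x, m in zip(sample, mu))
              for c, mu in centroids.items()}
    top = max(scores.values())
    expd = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(expd.values())
    return {c: v / total for c, v in expd.items()}

def shrink(weights, delta):
    """Soft-threshold node weights; nodes at or below delta drop out."""
    return {n: w - delta for n, w in weights.items() if w > delta}

def parent_probability(child_probs, weights):
    """Weighted sum of the surviving children's probabilities, renormalised."""
    kept = [n for n in child_probs if n in weights]
    total = sum(weights[n] for n in kept)
    classes = child_probs[kept[0]].keys()
    return {c: sum(weights[n] * child_probs[n][c] for n in kept) / total
            for c in classes}

# Toy data (hypothetical): two genes annotated to one GO-node.
train = {"MLL": [[2.0, 2.1], [1.8, 2.2]], "other": [[0.0, 0.1], [0.2, -0.1]]}
cents = {c: centroid(rows) for c, rows in train.items()}
p_a = leaf_probability([1.9, 2.0], cents)   # informative child node
p_b = {"MLL": 0.5, "other": 0.5}            # uninformative child node
w = shrink({"node_a": 1.0, "node_b": 0.2}, delta=0.2)
p_parent = parent_probability({"node_a": p_a, "node_b": p_b}, w)
```

In this toy run the uninformative child is removed by the threshold, so the parent's probabilities are those of the remaining child.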

Dataset: 327 cases of acute lymphoblastic leukemia (ALL), various translocation types. Hybridised on HG-U95Av2 (12625 probe-sets) [Yeoh et al., 2002]. StAM analysis to detect MLL is based on the biological-process branch of GO only. Test set: 109 cases including 7 MLL.

Graph to the left: nodes involved in MLL prediction after shrinkage.

Image above: probability of correct MLL classification using exclusively information on the biological aspects mentioned to the right.

Cluster similar patients together. Cluster similar GO-nodes together. Molecular symptoms appear as blocks corresponding to particular groups of patients.
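The two clustering steps can be illustrated with a small sketch. It assumes average-linkage agglomerative clustering on Euclidean distances, which the poster does not specify, and applies it once to the rows (GO-nodes) and once to the columns (patients) of a matrix of per-node probabilities. The function names and the matrix values are hypothetical toy data.

```python
def euclid(a, b):
    """Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_order(vectors):
    """Average-linkage agglomerative clustering; returns indices in leaf order."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(euclid(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        pairs = [(linkage(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0] if clusters else []

# Toy matrix: rows are GO-nodes, columns are patients (hypothetical values).
probs = [
    [0.9, 0.1, 0.8, 0.2],
    [0.1, 0.9, 0.2, 0.8],
    [0.8, 0.2, 0.9, 0.1],
]
row_order = cluster_order(probs)                # similar GO-nodes adjacent
col_order = cluster_order(list(zip(*probs)))    # similar patients adjacent
reordered = [[probs[r][c] for c in col_order] for r in row_order]
```

After reordering, rows and columns with similar profiles sit next to each other, which is what makes block patterns visible in a heat map.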

[GO Consortium, 2000] The Gene Ontology Consortium. Gene ontology: Tool for the unification of biology. Nature Genetics, 25:25-29, May 2000.

[Tibshirani et al., 2002] R. Tibshirani, T. Hastie, B. Narasimhan, G. Chu. Multi-class diagnosis of cancers using shrunken centroids of gene expression. Proc Natl Acad Sci USA, 99:6567-6572, 2002.

[Yeoh et al., 2002] E. J. Yeoh, M. E. Ross et al. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell, 1:133-145, March 2002.

Biologically Informed Classification: StAM - Structured Analysis of Microarrays

In spite of shrinkage the amount of data is overwhelming. We implemented an interactive browser as a Java application. Java Swing components allow for enhanced interactivity. The browser reads GO from text format and StAM results from XML files. It shows the GO structure, prediction results per node and test case, as well as the list of genes used in annotated GO-nodes.
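To make the two inputs concrete, here is a small Python sketch of reading them. The actual file layouts are not given on the poster, so both are assumptions: the text format is taken to be one term per line with two spaces of indentation per level, and the XML element and attribute names (`stam`, `node`, `case`, `id`, `prob`) are invented for illustration. The nesting of GO terms in the example is illustrative only.

```python
import xml.etree.ElementTree as ET

def parse_indented(text, indent=2):
    """Build {term_id: set of parent ids} from an indented term listing.

    A term may appear under several parents, since GO is a graph and not
    a tree. Layout assumed: 'GO:id name', `indent` spaces per level.
    """
    parents, stack = {}, []          # stack[d] = last term seen at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        term_id = line.split()[0]
        del stack[depth:]            # climb back up to the current depth
        parents.setdefault(term_id, set())
        if stack:
            parents[term_id].add(stack[-1])
        stack.append(term_id)
    return parents

def parse_results(xml_text):
    """Map node id -> {case id: probability} from a hypothetical result file."""
    root = ET.fromstring(xml_text)
    return {node.get("id"): {case.get("id"): float(case.get("prob"))
                             for case in node.findall("case")}
            for node in root.findall("node")}

go_text = """GO:0009308 amine metabolism
  GO:0006520 amino acid metabolism
    GO:0008652 amino acid biosynthesis
  GO:0009309 amine biosynthesis
    GO:0008652 amino acid biosynthesis
"""
xml_text = """<stam>
  <node id="GO:0009308">
    <case id="p1" prob="0.93"/>
    <case id="p2" prob="0.12"/>
  </node>
</stam>"""

parents = parse_indented(go_text)
results = parse_results(xml_text)
```

A browser built on such structures could look up, for any selected node, its parents and the per-case probabilities to display.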

Interactive Navigation - Browse through Prediction Results

Model Inspection - Discovering Molecular Symptoms

References

This work was done within the context of the ... and is supported by Projektträger Jülich.

Goal: Medical diagnosis and research using gene expression patterns.

Problem: Many genes - few samples - overfitting.

Shortcomings in traditional approaches: Genes are treated as anonymous variables.

Our approach: Use functional annotation from gene ontology in addition to expression data.

GO nodes shown in the figure:

GO:0009308 amine metabolism

GO:0006520 amino acid metabolism

GO:0006576 biogenic amine metabolism

GO:0009309 amine biosynthesis

GO:0008652 amino acid biosynthesis

GO:0042401 biogenic amine synthesis

Navigating towards Biologically Resolved Diagnoses through Structured Analysis of Microarrays

Claudio Lottaz¹, Julie Floch¹, Renate Kirschner², Christian Hagemeier², Rainer Spang¹

¹ Max-Planck-Institute for Molecular Genetics, Ihnestr. 73, D-14195 Berlin

² Medical Center Charité, Augustenburger Platz 1, D-13353 Berlin

[Diagram: GO is read as indented text; StAM results are read as an XML file.]

[Figure: blocks labelled Molecular Symptom I, Molecular Symptom II and Molecular Symptom III.]

[Figure: groups labelled C, D, D' and D\D'.]

Molecular Symptoms:

Setup: Control and disease group.

Problem: There might be more than one molecular disorder that leads to the disease.

Approach: Subclass finding.

Molecular symptoms: Signatures that separate the control group from a subclass of the disease group.

Characteristics: High specificity and suboptimal sensitivity. Not all patients need to show a certain molecular symptom, and patients can show more than one molecular symptom.
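The characteristics listed above can be made concrete with a short Python sketch. It assumes each molecular symptom is available as a yes/no call per sample, which is a simplification for illustration. The sample labels and calls are toy data, and the function names are hypothetical.

```python
def sensitivity(calls, is_disease):
    """Fraction of disease samples that show the symptom."""
    pos = [c for c, d in zip(calls, is_disease) if d]
    return sum(1 for c in pos if c) / len(pos)

def specificity(calls, is_disease):
    """Fraction of control samples that do not show the symptom."""
    neg = [c for c, d in zip(calls, is_disease) if not d]
    return sum(1 for c in neg if not c) / len(neg)

# Toy data: four controls followed by four disease samples.
is_disease = [False, False, False, False, True, True, True, True]
symptom_1 = [False, False, False, False, True, True, False, False]
symptom_2 = [False, False, False, False, False, True, True, False]

# A sample is flagged if it shows at least one of the symptoms.
either = [a or b for a, b in zip(symptom_1, symptom_2)]

sens_1 = sensitivity(symptom_1, is_disease)    # only part of the disease group
spec_1 = specificity(symptom_1, is_disease)    # no control shows the symptom
sens_any = sensitivity(either, is_disease)     # coverage of both symptoms together
```

In this toy setting each symptom alone covers only a subclass of the disease group while leaving all controls unflagged, one patient shows both symptoms, and one patient shows neither.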