Navigating towards Biologically Resolved Diagnoses through...


Probe-sets are annotated to GO-nodes.
For each annotated GO-node, one classifier using the nearest shrunken centroid method [Tibshirani et al., 2002] is implemented.
Each node obtains a weight according to its deviance.
Results of children are collected in their parents through weighted sums.
Shrinkage removes uninformative nodes.

Each node computes classification probabilities based only on information about genes involved in the corresponding biological aspect.
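The collection of child results in parent nodes can be sketched as follows. This is a minimal illustration and not the StAM implementation: the `Node` class, the soft-threshold rule applied to weights, the normalisation by total weight, and the example numbers are all assumptions. The poster states only that weights are derived from deviance, that children are combined through weighted sums, and that shrinkage removes uninformative nodes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One GO-node; all field names are hypothetical."""
    name: str
    weight: float = 1.0            # assumed to be derived from the node's deviance
    prob: Optional[float] = None   # class probability from the node's own classifier
    children: List["Node"] = field(default_factory=list)


def shrink(weight: float, delta: float) -> float:
    """Soft-threshold a weight (an assumed stand-in for the shrinkage step)."""
    return max(weight - delta, 0.0)


def propagate(node: Node, delta: float = 0.0) -> Optional[float]:
    """Return the class probability at a node, or None if nothing informative remains."""
    if not node.children:
        return node.prob
    weighted_sum = 0.0
    total_weight = 0.0
    for child in node.children:
        p = propagate(child, delta)
        w = shrink(child.weight, delta)
        if p is None or w == 0.0:
            continue  # this child was shrunk away and contributes nothing
        weighted_sum += w * p
        total_weight += w
    if total_weight == 0.0:
        return None
    # Normalising by the total weight is an assumption of this sketch.
    return weighted_sum / total_weight


# Invented example: three leaf nodes below one parent.
root = Node("parent", children=[
    Node("A", weight=3.0, prob=0.9),
    Node("B", weight=1.0, prob=0.5),
    Node("C", weight=0.2, prob=0.1),
])
print(propagate(root, delta=0.5))
```

With `delta=0.5` the low-weight node C drops out, so the parent's result depends only on A and B.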

Dataset: 327 cases of acute lymphoblastic leukemia (ALL), various translocation types, hybridised on HG-U95Av2 (12625 probe-sets) [Yeoh et al., 2002].
StAM analysis to detect MLL based on the biological-process branch of GO only.
Test set: 109 cases including 7 MLL.

Graph to the left: nodes involved in MLL prediction after shrinkage.

Image above: probability for correct MLL classification exclusively using information on the biological aspects mentioned to the right.
Cluster similar patients together.
Cluster similar GO-nodes together.
Molecular symptoms appear as blocks corresponding to particular groups of patients.
[GO Consortium, 2000]
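The two-way reordering behind such an image can be sketched as follows. The poster does not name the clustering method, so single-linkage agglomerative clustering with Euclidean distance is an assumption made here for illustration. The function names and the small probability matrix (rows as GO-nodes, columns as patients) are invented.

```python
def euclid(a, b):
    """Euclidean distance between two equally long numeric sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster_order(vectors):
    """Order items so that similar ones end up adjacent (single linkage)."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclid(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters[0] if clusters else []


def reorder(matrix):
    """Cluster rows (GO-nodes) and columns (patients) and reorder the matrix."""
    row_order = cluster_order(matrix)
    col_order = cluster_order([list(col) for col in zip(*matrix)])
    reordered = [[matrix[r][c] for c in col_order] for r in row_order]
    return row_order, col_order, reordered


# Invented probabilities: 3 GO-nodes x 4 patients.
probs = [
    [0.9, 0.1, 0.8, 0.2],
    [0.1, 0.9, 0.2, 0.8],
    [0.8, 0.2, 0.9, 0.1],
]
row_order, col_order, blocks = reorder(probs)
for row in blocks:
    print(row)
```

After reordering, high probabilities sit next to each other, which is how block patterns for particular patient groups become visible.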

[Tibshirani et al., 2002] R. Tibshirani, T. Hastie, B. Narasimhan, G. Chu. Multi-class diagnosis of cancers using shrunken centroids of gene expression. Proc Natl Acad Sci USA, 99:6567--6572, 2002.

[GO Consortium, 2000] The Gene Ontology Consortium. Gene ontology: Tool for the unification of biology. Nature Genetics, 25:25--29, May 2000.

[Yeoh et al., 2002] E. J. Yeoh, M. E. Ross et al. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell, 1:133--145, March 2002.

Biologically Informed Classification:

StAM - Structured Analysis of Microarrays

In spite of shrinkage, the amount of data is overwhelming.
We implemented an interactive browser as a Java application.
Java Swing components allow for enhanced interactivity.
Reads GO from text format and StAM results from XML files.
The browser shows the GO structure, prediction results per node and test case, as well as the list of genes used in annotated GO-nodes.
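Reading a hierarchy from indented text can be sketched as follows. The actual file layout used by the browser is not described on the poster, so the format assumed here (one term per line, nesting depth given by leading spaces) and the function name `parse_indented` are hypothetical. The example hierarchy is for illustration only. The sketch is in Python, not Java, and does not reflect the browser's code.

```python
def parse_indented(text):
    """Build a nested tree from lines whose leading spaces encode depth."""
    root = {"label": "ROOT", "children": []}
    stack = [(-1, root)]  # (indentation, node) pairs along the current path
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        depth = len(line) - len(line.lstrip(" "))
        node = {"label": line.strip(), "children": []}
        # Climb back up until the top of the stack is this line's parent.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root


# Assumed example layout; GO is a graph, so a term may appear under several parents.
text = (
    "amine metabolism\n"
    "  amino acid metabolism\n"
    "    amino acid biosynthesis\n"
    "  biogenic amine metabolism\n"
)
tree = parse_indented(text)
top = tree["children"][0]
print(top["label"], [child["label"] for child in top["children"]])
```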

Interactive Navigation - Browse through Prediction Results

Model Inspection - Discovering Molecular Symptoms

References

This work was done within the context of the and is supported by

Projektträger Jülich

Goal: Medical diagnosis and research using gene expression patterns.

Problem: Shortcomings in traditional approaches:
Many genes - few samples - overfitting
Genes are treated as anonymous variables

Our approach: Use functional annotation from gene ontology in addition to expression data.

GO:0008652 amino acid biosynthesis
GO:0009308 amine metabolism
GO:0042401 biogenic amine synthesis
GO:0006520 amino acid metabolism
GO:0006576 biogenic amine metabolism
GO:0009309 amine biosynthesis
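Using functional annotation alongside expression data amounts to grouping probe-sets by GO-node, so that each node sees only its own genes. A minimal sketch follows. The probe-set names, the annotation table and the expression values are invented, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical annotation table: probe-set -> annotated GO-nodes.
annotations = {
    "probe_a": ["GO:0008652"],
    "probe_b": ["GO:0006576", "GO:0009309"],
    "probe_c": ["GO:0008652", "GO:0006520"],
}


def probes_per_node(annotations):
    """Invert the annotation table: GO-node -> list of probe-sets."""
    node_to_probes = defaultdict(list)
    for probe, go_ids in annotations.items():
        for go_id in go_ids:
            node_to_probes[go_id].append(probe)
    return dict(node_to_probes)


def node_expression(expression, probes):
    """Restrict one sample's expression profile to a node's probe-sets."""
    return {p: expression[p] for p in probes if p in expression}


groups = probes_per_node(annotations)
sample = {"probe_a": 7.2, "probe_b": 5.1, "probe_c": 6.4}  # invented values
print(node_expression(sample, groups["GO:0008652"]))
```

A per-node classifier would then be trained on the restricted profiles only, so genes are no longer treated as anonymous variables.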

Navigating towards Biologically Resolved Diagnoses through Structured Analysis of Microarrays


Claudio Lottaz (1), Julie Floch (1), Renate Kirschner (2), Christian Hagemeier (2), Rainer Spang (1)

(1) Max-Planck-Institute for Molecular Genetics, Ihnestr. 73, D-14195 Berlin

(2) Medical Center Charité, Augustenburger Platz 1, D-13353 Berlin

Diagram: browser input. GO as indented text, StAM results as XML file.

Figure labels: Molecular Symptom I, Molecular Symptom II, Molecular Symptom III

Diagram labels: D, C, D', D\D', C

Molecular Symptoms:

Setup: Control and Disease Group
Problem: There might be more than one molecular disorder that leads to the disease
Approach: Subclass finding
Molecular Symptoms: Signatures that separate the control group from a subclass of the disease group
Characteristics: high specificity and suboptimal sensitivity; not all patients need to show a certain molecular symptom; patients can show more than one molecular symptom
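The characteristics of a molecular symptom can be made concrete with a small calculation. The sketch below uses invented calls for eight disease and eight control samples. The function name and all data are hypothetical and serve only to show what high specificity with suboptimal sensitivity looks like.

```python
def sensitivity_specificity(disease_calls, control_calls):
    """Sensitivity and specificity of one symptom from boolean calls."""
    true_pos = sum(1 for call in disease_calls if call)
    true_neg = sum(1 for call in control_calls if not call)
    return true_pos / len(disease_calls), true_neg / len(control_calls)


# Invented calls: True means the sample shows the symptom.
symptom_1 = [True, True, True, False, False, False, False, False]
symptom_2 = [False, False, True, True, True, False, False, False]
controls = [False] * 8  # no control sample shows either symptom

sens_1, spec_1 = sensitivity_specificity(symptom_1, controls)
sens_2, spec_2 = sensitivity_specificity(symptom_2, controls)

# Patients may show several symptoms, so coverage is the union of the calls.
covered = sum(1 for a, b in zip(symptom_1, symptom_2) if a or b)
coverage = covered / len(symptom_1)

print(sens_1, spec_1, sens_2, spec_2, coverage)
```

Each symptom alone detects only a subclass of the disease group, while several symptoms together cover more patients. Here the third patient shows both.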
