Probe-sets are annotated to GO-nodes.
For each annotated GO-node, one classifier is built using the nearest shrunken centroid method [Tibshirani et al., 2002].
Each node obtains a weight according to its deviance.
Results of children are collected in their parents through weighted sums.
Shrinkage removes uninformative nodes.
Each node computes classification probabilities based only on information about genes involved in the corresponding biological aspect.
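The collection of child results in their parents can be sketched as follows. This is only an illustration under stated assumptions: the node names, probabilities and weights are hypothetical, the weights are given directly rather than derived from a deviance, and the weighted sum is normalised by the total weight so that the result stays a probability. A weight of zero stands in for a node removed by shrinkage.

```python
from typing import Dict, List


def propagate(children: Dict[str, List[str]],
              leaf_prob: Dict[str, float],
              weight: Dict[str, float],
              node: str) -> float:
    """Return the class probability at `node` as a weighted mean of its children.

    Leaves return the probability of their own classifier (assumed given).
    """
    kids = children.get(node, [])
    if not kids:
        return leaf_prob[node]
    # Children with zero weight (e.g. shrunken away) do not contribute.
    total = sum(weight[k] for k in kids)
    if total == 0:
        # Assumption: with no informative child, fall back to 0.5.
        return 0.5
    weighted = sum(weight[k] * propagate(children, leaf_prob, weight, k)
                   for k in kids)
    return weighted / total


# Hypothetical toy hierarchy: root -> {a, b}, a -> {a1, a2}
children = {"root": ["a", "b"], "a": ["a1", "a2"]}
leaf_prob = {"a1": 0.9, "a2": 0.5, "b": 0.2}
weight = {"a": 3.0, "b": 1.0, "a1": 1.0, "a2": 0.0}

root_prob = propagate(children, leaf_prob, weight, "root")
print(round(root_prob, 3))
```

In this toy example node `a2` has weight zero, so node `a` takes its probability from `a1` alone, and the root combines `a` and `b` in a 3:1 ratio.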
Dataset: 327 cases of acute lymphoblastic leukemia (ALL) with various translocation types, hybridised on HG-U95Av2 (12625 probe-sets) [Yeoh et al., 2002].
StAM analysis to detect MLL based on the biological-process branch of GO only.
Test set: 109 cases including 7 MLL.
Graph to the left: nodes involved in MLL prediction after shrinkage.
Image above: probability of correct MLL classification using exclusively information on the biological aspects mentioned to the right.
Cluster similar patients together.
Cluster similar GO-nodes together.
Molecular symptoms appear as blocks corresponding to particular groups of patients.
[GO Consortium, 2000] The Gene Ontology Consortium. Gene ontology: Tool for the unification of biology. Nature Genetics, 25:25-29, May 2000.
[Tibshirani et al., 2002] R. Tibshirani, T. Hastie, B. Narasimhan, G. Chu. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA, 99:6567-6572, 2002.
[Yeoh et al., 2002] E. J. Yeoh, M. E. Ross et al. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell, 1:133-145, March 2002.
Biologically Informed Classification:
StAM - Structured Analysis of Microarrays
In spite of shrinkage, the amount of data is overwhelming.
We implemented an interactive browser as a Java application.
Java Swing components allow for enhanced interactivity.
The browser reads GO from text format and StAM results from XML files.
It shows the GO structure, prediction results per node and test case, as well as the list of genes used in annotated GO-nodes.
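As an illustration of reading a hierarchy from indented text, the following sketch turns indentation levels into parent-child edges. The poster does not specify the exact file layout, so the format assumed here (one term per line, two spaces per level) and the function name `parse_indented` are hypothetical; the browser itself is a Java application, and this Python sketch only shows the idea. The sample terms form a toy excerpt, not the actual GO file.

```python
from typing import List, Optional, Tuple

Edge = Tuple[Optional[str], str]  # (parent, child); parent is None for a root


def parse_indented(lines: List[str], indent: int = 2) -> List[Edge]:
    """Convert indented lines into (parent, child) edges.

    Assumption: depth is the number of leading spaces divided by `indent`.
    A term may appear several times under different parents, which is how
    a graph with multiple parents can be written as indented text.
    """
    edges: List[Edge] = []
    stack: List[str] = []  # stack[d] is the most recent term seen at depth d
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        name = raw.strip()
        del stack[depth:]  # drop terms at this depth or deeper
        edges.append((stack[-1] if stack else None, name))
        stack.append(name)
    return edges


# Toy excerpt in the assumed format
sample = """biological_process
  metabolism
    amine metabolism
      amine biosynthesis
    amino acid metabolism
"""

edges = parse_indented(sample.splitlines())
for parent, child in edges:
    print(parent, "->", child)
```

Returning a list of edges rather than a single parent per term keeps terms with several parents intact.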
Interactive Navigation - Browse through Prediction Results
Model Inspection - Discovering Molecular Symptoms
References
This work was done within the context of the and is supported by
Projektträger Jülich
Goal: Medical diagnosis and research using gene expression patterns.
Problem: Shortcomings in traditional approaches:
Many genes - few samples - overfitting
Genes are treated as anonymous variables
Our approach: Use functional annotation from Gene Ontology in addition to expression data.
Figure (example GO nodes):
GO:0008652 amino acid biosynthesis
GO:0009308 amine metabolism
GO:0042401 biogenic amine synthesis
GO:0006520 amino acid metabolism
GO:0006576 biogenic amine metabolism
GO:0009309 amine biosynthesis
Navigating towards Biologically Resolved Diagnoses through Structured Analysis of Microarrays
Claudio Lottaz (1), Julie Floch (1), Renate Kirschner (2), Christian Hagemeier (2), Rainer Spang (1)
(1) Max-Planck-Institute for Molecular Genetics, Ihnestr. 73, D-14195 Berlin
(2) Medical Center Charité, Augustenburger Platz 1, D-13353 Berlin
Figure (browser input): GO as indented text, StAM results as XML file
Figure labels: Molecular Symptom I, Molecular Symptom II, Molecular Symptom III
Figure labels: D, C, D', D\D', C
Molecular Symptoms:
Setup: Control and disease group
Problem: There might be more than one molecular disorder that leads to the disease
Approach: Subclass finding
Molecular Symptoms: Signatures that separate the control group from a subclass of the disease group
Characteristics: High specificity and suboptimal sensitivity. Not all patients need to show a certain molecular symptom. Patients can show more than one molecular symptom.
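The characteristic "high specificity and suboptimal sensitivity" can be made concrete with a small calculation. The sketch below uses the standard definitions of sensitivity and specificity; the patient data are made up for illustration and the function name is hypothetical. A symptom shown by only part of the disease group and by no control yields perfect specificity but limited sensitivity.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(shows_symptom: Sequence[bool],
                            is_disease: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a symptom as a disease indicator."""
    pairs = list(zip(shows_symptom, is_disease))
    tp = sum(1 for s, d in pairs if s and d)          # diseased, symptom shown
    fn = sum(1 for s, d in pairs if not s and d)      # diseased, symptom absent
    tn = sum(1 for s, d in pairs if not s and not d)  # control, symptom absent
    fp = sum(1 for s, d in pairs if s and not d)      # control, symptom shown
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 6 disease cases followed by 4 controls.
is_disease = [True] * 6 + [False] * 4
# The symptom marks a subclass of 3 disease cases and no control.
shows_symptom = [True, True, True, False, False, False] + [False] * 4

sens, spec = sensitivity_specificity(shows_symptom, is_disease)
print(sens, spec)
```

Here the symptom covers only half of the disease group, so the remaining patients would have to be explained by other molecular symptoms.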