Alex Lewin (Imperial College Centre for Biostatistics)

Ian Grieve (IC Microarray Centre)

Elena Kulinskaya (IC Statistical Advisory Service)

Improving Interpretation in Gene Set Enrichment Analysis

Introduction

• Microarray experiment yields a list of differentially expressed (DE) genes

• Genes belong to categories of Gene Ontology (GO)

• Are some GO categories (groups of genes) over-represented amongst the DE genes?

Contents

• Grouping Gene Ontology categories can improve interpretation of gene set enrichment analysis

• Fuzzy decision rules for multiple testing with discrete data

Gene Ontology (GO)

Database of biological terms

Arranged in graph connecting related terms: links from more general to more specific terms

For each node, can define ancestor and descendant terms

Directed Acyclic Graph

~16,000 terms

from QuickGO website (EBI)

Gene Annotations

• Genes/proteins annotated to relevant GO terms

  – Gene may be annotated to several GO terms

  – GO term may have 1000s of genes annotated to it (or none)

• A gene annotated to term A is also annotated to all ancestors of A
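The propagation rule for annotations can be sketched on a toy fragment of the graph. This is a minimal illustration, not how any GO tool stores its data: the `PARENTS` edges, the gene name `geneA` and the helper names `ancestors` and `propagate` are all assumptions made for the example.

```python
# Toy GO fragment: each term maps to its direct parents (more general terms).
# The edges are illustrative assumptions, not read from the GO database.
PARENTS = {
    "biological process": [],
    "response to stimulus": ["biological process"],
    "response to stress": ["response to stimulus"],
    "response to external stimulus": ["response to stimulus"],
    "response to wounding": ["response to stress", "response to external stimulus"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def propagate(direct_annotations, parents=PARENTS):
    """Map each gene to its direct terms plus all ancestors of those terms."""
    full = {}
    for gene, terms in direct_annotations.items():
        closure = set(terms)
        for term in terms:
            closure |= ancestors(term, parents)
        full[gene] = closure
    return full


# A hypothetical gene annotated directly to one specific term.
annotations = propagate({"geneA": ["response to wounding"]})
print(sorted(annotations["geneA"]))
```

Because the graph is a DAG rather than a tree, a term can be reached along several paths; the `seen` set makes sure each ancestor is counted once.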

Find GO terms over-represented amongst differentially expressed genes

For each GO term, compare:

proportion of differentially expressed genes annotated to that term

v.

proportion of non-differentially expressed genes annotated to that term

Fisher’s test p-value for each GO term.

Multiple testing considerations determine the threshold below which p-values are declared significant.

Many websites do this type of analysis, e.g. the FatiGO website http://fatigo.bioinfo.cnio.es/

2x2 table of GO / not GO against DE / not DE (example counts): cell count 22, margins 173 and 467, total 7847 genes
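A one-sided Fisher test for a single GO term can be sketched with the standard library alone. The sketch reads the example counts in the same way as the null distribution HyperGeom(173, 7847-173, 467) stated later in the talk: 173 genes annotated to the term, 467 DE genes, 7847 genes in total, and 22 genes in the overlap. That reading, the choice of the upper tail for over-representation, and the function names are assumptions for illustration.

```python
from math import comb


def hypergeom_pmf(x, n_go, n_total, n_de):
    """P(X = x) when n_de genes are drawn from n_total, of which n_go are annotated."""
    lowest = max(0, n_de - (n_total - n_go))
    highest = min(n_go, n_de)
    if x < lowest or x > highest:
        return 0.0
    # Exact integer arithmetic, converted to float only in the final division.
    return comb(n_go, x) * comb(n_total - n_go, n_de - x) / comb(n_total, n_de)


def fisher_upper_tail(x_obs, n_go, n_total, n_de):
    """One-sided p-value P(X >= x_obs): evidence for over-representation."""
    highest = min(n_go, n_de)
    return sum(hypergeom_pmf(x, n_go, n_total, n_de)
               for x in range(x_obs, highest + 1))


# Counts from the example table, under the assumed reading of the margins.
p_value = fisher_upper_tail(22, n_go=173, n_total=7847, n_de=467)
print(f"one-sided p-value: {p_value:.2e}")
```

Under this reading about 10 annotated DE genes would be expected by chance (467 x 173 / 7847), so an overlap of 22 gives a small p-value.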

Difficulties in Testing GO terms

Interpretation: many terms close in the graph may be found significant – or not significant but many low p-values close together in the graph

Statistical Power: many terms have few genes annotated

Discrete statistics: p-values not Uniform under null

Grouping GO terms

Use the Poset Ontology Categorizer (POSOC)

Joslyn et al. 2004

Software which groups terms based on

- pseudo-distance between terms

- ‘coverage’ of genes

Example: for data used here, reduces ~16,000 terms to 76 groups

Example: genes associated with the insulin-resistance gene Cd36

Knock-out and wildtype mice

Bayesian hierarchical model gives posterior probabilities (pg) of being differentially expressed

Most differentially expressed:

pg > 0.5 (280 genes)

Least differentially expressed:

pg < 0.2 (11171 genes)

Example Results

Individual term tests

Used the FatiGO website

Multiple testing corrections (Benjamini and Hochberg FDR) done separately for each ‘level’

Found no GO terms significant when FDR controlled at 5%

Group tests

POSOC on all genes on U74A chip, gives 76 groups

3 groups found significant when controlling FDR at 5%
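The Benjamini and Hochberg step-up procedure used for both sets of tests can be sketched as follows. This is a generic implementation under the usual definition (reject the k smallest p-values, where k is the largest rank with p(k) <= alpha * k / m); it does not reproduce the per-level correction applied by FatiGO, and the example p-values are made up.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the BH procedure rejects the null."""
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k_max = rank  # keep the largest rank that meets its threshold
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Hypothetical p-values, not taken from the Cd36 analysis.
example = [0.041, 0.001, 0.205, 0.008, 0.039, 0.074, 0.042, 0.06]
print(benjamini_hochberg(example, alpha=0.05))
```

Only the two smallest p-values (0.001 and 0.008) fall below their rank thresholds here, so only those two tests are rejected.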

Comparison of Individual and Group Tests

Rank in FatiGO (smallest p-values)         Membership of POSOC group                  Significant

 1: response to external stimulus          IA                                         IA
 2: resp. to pest, pathogen or parasite    response to p.p.p.                         yes
 3: response to wounding                   response to wounding                       yes
 4: organismal movement                    IA                                         IA
 5: response to biotic stimulus            IA                                         IA
 6: neurophysiological process             -                                          -
 7: response to stress                     IA                                         IA
 8: inflammatory response                  immune resp, resp. to ppp, resp to wound   yes
 9: transmission of nerve impulse          -                                          -
10: neuromuscular physiological proc.      -                                          -
11: defense response                       IA                                         IA
12: immune response                                                                   yes, yes
13: chemotaxis                             chemotaxis, cell-migration                 no (at 5%), no
14: nucleobase, nucleoside, nuc …          -                                          -
15: cell-cell signalling                   -                                          -

IA = immediate ancestor of significant POSOC group

Comparison of Individual and Group Tests

[Figure: fragment of the GO graph below Biological process. Nodes: Physiological process, Organismal movement, Response to stimulus, Response to external stimulus, Response to biotic stimulus, Response to stress, Response to wounding, Defense response, Inflammatory response, Response to pest, pathogen or parasite, Immune response, Response to other organism]

Legend: Ranks high individually (smallest p-values); Significant in group tests (and ranks high individually)

Discrete test statistics

Null hypothesis determined by margins of 2x2 table

Often there is a very small number of possible values for the cells, and hence a small number of possible p-values

2x2 table of GO / not GO against DE / not DE: cell count X, margins 173 and 467, total 7847 genes

Null Hypothesis:

X ~ HyperGeom(173, 7847-173, 467)

X = 0,…,173

Discrete test statistics

2x2 table of GO / not GO against DE / not DE: cell count X, margins 173 and 467, total 7847 genes

p-value p(x) = P( X ≤ x | null )

P( p ≤ α | null) ≠ α for most α
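The effect of discreteness on the size of the test can be checked numerically. The sketch below lists every attainable p-value p(x) = P( X ≤ x | null ) for the example margins and then computes P( p ≤ α | null ), which equals the largest attainable p-value not exceeding α. The function names are illustrative assumptions.

```python
from math import comb


def lower_tail_pvalues(n_go, n_total, n_de):
    """All attainable p-values P(X <= x) under the hypergeometric null, in order."""
    lowest = max(0, n_de - (n_total - n_go))
    highest = min(n_go, n_de)
    denom = comb(n_total, n_de)
    cdf = []
    running = 0.0
    for x in range(lowest, highest + 1):
        running += comb(n_go, x) * comb(n_total - n_go, n_de - x) / denom
        cdf.append(running)
    return cdf


def actual_size(pvalues, alpha):
    """P(p <= alpha | null): the largest attainable p-value that is <= alpha."""
    attainable = [p for p in pvalues if p <= alpha]
    return max(attainable) if attainable else 0.0


pvals = lower_tail_pvalues(173, 7847, 467)
size = actual_size(pvals, alpha=0.05)
print(f"{len(pvals)} attainable p-values; actual size at alpha = 0.05: {size:.4f}")

# A term with only 3 annotated genes has just 4 attainable p-values.
print(lower_tail_pvalues(3, 7847, 467))
```

The actual size can never exceed α, and for terms with few annotated genes it can fall well below it, which is the loss of power noted earlier.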

Randomised Test

Observe X=x0

pobs = observed p-value = P( X ≤ x0 | null )

pprev = next smallest possible p-value = P( X ≤ x0-1 | null )

Randomised p-value

P(x0) = P( X < x0 | null ) + u*P( X = x0 | null ) where u ~ Unif(0,1)

= pprev + u*(pobs - pprev)

conditionally, P | x0 ~ Unif(pprev , pobs)

unconditionally, P ~ Unif(0,1)

[Figure: p-value axis from 0 to 1 with pprev and pobs marked]
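A small simulation can illustrate that the randomised p-value is uniform under the null. The margins below (10 annotated genes, 30 genes in total, 8 DE genes) are hypothetical and chosen only to keep the example fast; the function names are likewise assumptions.

```python
import random
from math import comb


def null_pmf(n_go, n_total, n_de):
    """Hypergeometric null distribution as a dict {x: P(X = x)}."""
    lowest = max(0, n_de - (n_total - n_go))
    highest = min(n_go, n_de)
    denom = comb(n_total, n_de)
    return {x: comb(n_go, x) * comb(n_total - n_go, n_de - x) / denom
            for x in range(lowest, highest + 1)}


def randomised_pvalue(x0, pmf, rng):
    """P(X < x0 | null) + u * P(X = x0 | null), with u ~ Unif(0, 1)."""
    p_prev = sum(prob for x, prob in pmf.items() if x < x0)
    return p_prev + rng.random() * pmf[x0]


rng = random.Random(0)
pmf = null_pmf(10, 30, 8)  # hypothetical small margins
support = list(pmf)
# Draw X under the null, then attach a randomised p-value to each draw.
draws = rng.choices(support, weights=[pmf[x] for x in support], k=20000)
random_pvals = [randomised_pvalue(x, pmf, rng) for x in draws]

mean_p = sum(random_pvals) / len(random_pvals)
frac_below = sum(p <= 0.05 for p in random_pvals) / len(random_pvals)
print(f"mean = {mean_p:.3f}, fraction <= 0.05 = {frac_below:.3f}")
```

For a Unif(0,1) variable the mean should be close to 0.5 and the fraction below 0.05 close to 0.05, up to simulation error.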

Fuzzy Decision Rule

Idea is to use all possible realisations of randomised test.

Summarise evidence by critical function of randomised test:

τα(pprev , pobs) =
    1                               if pobs < α
    (α - pprev)/(pobs - pprev)      if pprev < α < pobs
    0                               if pprev > α

[Figure: p-value axis from 0 to 1 with pprev and pobs marked]

Use τα as a fuzzy measure of evidence against the null hypothesis.

(Fuzzy decision rule considered by Cox & Hinkley, 1974 and developed by Geyer and Meeden 2005)
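The critical function for a single test is short enough to write out directly. In this sketch the boundary cases pobs = α and pprev = α, which the definition leaves open, are assigned to 1 and 0 respectively; that choice and the function name `fuzzy_decision` are assumptions.

```python
def fuzzy_decision(alpha, p_prev, p_obs):
    """Probability that a randomised p-value drawn from Unif(p_prev, p_obs) is <= alpha."""
    if p_obs <= alpha:
        return 1.0  # the whole interval lies below alpha
    if p_prev >= alpha:
        return 0.0  # the whole interval lies above alpha
    # alpha falls inside the interval: fraction of the interval below alpha.
    return (alpha - p_prev) / (p_obs - p_prev)


print(fuzzy_decision(0.05, 0.03, 0.08))
```

A value of 1 or 0 reproduces the usual crisp decision; intermediate values say how much of the interval between pprev and pobs lies below α.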

Fuzzy Decision Rules for Multiple Testing

We have developed fuzzy decision rules for multiple tests (i = 1,…,m)

Use Benjamini and Hochberg false discovery rate (BH FDR)

τα^BH(pprev_i , pobs_i) = P( randomised p-value i is rejected | null ) using BH FDR procedure

For a small number of tests we can calculate these exactly.

Fuzzy Decision Rules for Multiple Testing

τα^BH(pprev_i , pobs_i) = P( randomised p-value i is rejected | null )

For a large number of tests use simulations:

for j = 1,…,n {
    generate randomised p-values (i = 1,…,m): P_ij ~ Unif(pprev_i , pobs_i)
    perform BH FDR procedure: I_ij = 1 if P_ij rejected, 0 else
}

estimate: τα^BH(pprev_i , pobs_i) ≈ (1/n) Σ_j I_ij
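The simulation loop above can be sketched in Python. This is a minimal version under stated assumptions: the (pprev, pobs) intervals are hypothetical rather than taken from the Cd36 analysis, the BH step is a generic implementation, and the names `bh_reject` and `fuzzy_bh` are made up for the example.

```python
import random


def bh_reject(pvalues, alpha):
    """Benjamini and Hochberg step-up procedure; True where the null is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


def fuzzy_bh(intervals, alpha, n_sim=5000, seed=0):
    """Monte Carlo estimate of the fuzzy BH decision for each test."""
    rng = random.Random(seed)
    counts = [0] * len(intervals)
    for _ in range(n_sim):
        # One realisation of the randomised p-values, one per test.
        draws = [rng.uniform(p_prev, p_obs) for p_prev, p_obs in intervals]
        for i, is_rejected in enumerate(bh_reject(draws, alpha)):
            counts[i] += is_rejected
    return [count / n_sim for count in counts]


# Hypothetical (pprev, pobs) pairs for m = 5 tests.
intervals = [(0.0, 0.004), (0.001, 0.02), (0.01, 0.04), (0.02, 0.2), (0.3, 1.0)]
tau_hat = fuzzy_bh(intervals, alpha=0.05)
print([round(t, 3) for t in tau_hat])
```

The first test is rejected in every realisation and the last in none, so their estimates are exactly 1 and 0; the tests in between receive fractional values that depend on the other tests through the BH thresholds.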

Results for Cd36 Example

[1] "alpha = 0.05"
   pprev    pval  i.bonf  i.bh    tau  POSOC group
1  1e-04   3e-04       1     1      1  response to pest, pathogen or parasite
2  1e-04   4e-04       1     1      1  response to wounding
3  2e-04   6e-04       1     1      1  immune response
4  7e-04  0.0079       0     0  0.297  digestion
5  0.003  0.0122       0     0  0.021  chemotaxis
6  0.0039 0.0209       0     0  0.002  organic acid biosynthesis
7  0.0092 0.0306       0     0      0  synaptic transmission
8  5e-04  0.0436       0     0  0.059  response to fungi

[1] "alpha = 0.15"
   pprev    pval  i.bonf  i.bh    tau  POSOC group
1  1e-04   3e-04       1     1      1  response to pest, pathogen or parasite
2  1e-04   4e-04       1     1      1  response to wounding
3  2e-04   6e-04       1     1      1  immune response
4  7e-04  0.0079       0     1      1  digestion
5  0.003  0.0122       0     0  0.943  chemotaxis
6  0.0039 0.0209       0     0  0.661  organic acid biosynthesis
7  0.0092 0.0306       0     0  0.375  synaptic transmission
8  5e-04  0.0436       0     0  0.391  response to fungi

Results for Cd36 Example

Order of fuzzy decisions is not the same as order of observed p-values

Depends on amount of discreteness of null

[Figure: interval from pprev to pobs on the p-value axis]
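The change of ordering can be seen even with the single-test critical function. The sketch below applies it to the eight (pprev, pobs) pairs from the results table at a hypothetical level α = 0.01. It uses the single-test rule, not the BH version behind the tau column of the table, so the values differ from those reported; only the reordering is the point.

```python
def fuzzy_decision(alpha, p_prev, p_obs):
    """Single-test critical function: share of (p_prev, p_obs) lying below alpha."""
    if p_obs <= alpha:
        return 1.0
    if p_prev >= alpha:
        return 0.0
    return (alpha - p_prev) / (p_obs - p_prev)


# (POSOC group, pprev, pobs) as listed in the results table.
groups = [
    ("response to pest, pathogen or parasite", 1e-04, 3e-04),
    ("response to wounding", 1e-04, 4e-04),
    ("immune response", 2e-04, 6e-04),
    ("digestion", 7e-04, 0.0079),
    ("chemotaxis", 0.003, 0.0122),
    ("organic acid biosynthesis", 0.0039, 0.0209),
    ("synaptic transmission", 0.0092, 0.0306),
    ("response to fungi", 5e-04, 0.0436),
]

alpha = 0.01  # hypothetical level chosen for illustration
tau = {name: fuzzy_decision(alpha, lo, hi) for name, lo, hi in groups}

by_pobs = [name for name, _, _ in sorted(groups, key=lambda g: g[2])]
by_tau = sorted(tau, key=lambda name: -tau[name])
print(by_pobs[-2:])
print(by_tau[-2:])
```

"response to fungi" has the largest observed p-value but a wide interval reaching down to 5e-04, so more of its interval lies below α than for "synaptic transmission", whose interval starts at 0.0092.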

Conclusions

• Grouping Gene Ontology categories can help find significant regions of the GO graph

• Fuzzy decision rules for multiple testing with discrete data can provide more candidates for rejection

Acknowledgements

Cliff Joslyn (Los Alamos National Laboratory)

Tim Aitman (IC Microarray Centre)

Sylvia Richardson (IC Centre for Biostatistics)

BBSRC ‘Exploiting Genomics’ grant (AL)

Wellcome Trust grant (IG)

References

Joslyn CA, Mniszewski SM, Fulmer A and Heaton G (2004), The Gene Ontology Categorizer, Bioinformatics 20, 169-177.

Geyer and Meeden (2005), Fuzzy Confidence Intervals and P-values, Statistical Science, to appear.
