Page 1: Mining publicly available microarray data Frances Turner Fsturner@ic.ac.uk

Mining publicly available microarray data

Frances Turner

Fsturner@ic.ac.uk

Introduction

• Publicly available data

• Method for data mining

• Application to Tuberculosis and Campylobacter

Capsule synthesis in C. jejuni

• In which dataset(s) do these genes show changed expression?

• Identify useful data

• Improve biological understanding

Publicly available data

• Increasing volume of data

• Different depositories

• Different standards

• Difficult to compare experiments

Publicly available data

• Campylobacter: 18 experiments, 126 conditions

• M. bovis / M. tuberculosis: 34 experiments, 539 conditions

Identification of sets of differentially expressed genes

• Gene set enrichment analysis (GSEA) is commonly used (Subramanian et al., 2005)

• Threshold independent

• Detects small but biologically significant changes
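The threshold-independent idea behind GSEA can be illustrated with a minimal sketch. It assumes the unweighted running-sum form of the enrichment score and a gene-set permutation null; the published method (Subramanian et al., 2005) also offers a weighted score and phenotype permutation, and the slides do not say which variant was used here. The function names `enrichment_score` and `permutation_p_value` are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list.

    Walks down the list, stepping up at genes in the set and down at all
    others, and returns the largest deviation from zero (signed).
    No expression threshold is applied at any point.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size (assumed null)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    extreme = 0
    for _ in range(n_perm):
        random_set = rng.sample(ranked_genes, size)
        if abs(enrichment_score(ranked_genes, random_set)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

A positive score means the set's genes cluster near the top of the ranking, a negative score near the bottom. Because every gene contributes to the running sum, a set of genes that each shift only slightly can still score strongly.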

GSEA applied to multiple expression datasets

Cj1099, Cj0812, Cj1494c, Cj1457c, Cj0434, Cj1307, Cj0028, Cj1294, Cj1393, Cj1303, Cj1368, Cj0597, Cj1309c, Cj0505c

Cj0172, Cj1099, Cj0028, Cj0812, Cj1494c, Cj0741, Cj1457c, Cj1303, Cj0434, Cj1393, Cj1307, Cj1294, Cj1393, Cj1309c

Cj0812, Cj1494c, Cj1307, Cj0434, Cj1393, Cj0028, Cj1294, Cj0597, Cj0145c, Cj1368, Cj0432, Cj1309c, Cj0505c

GSEA applied to multiple expression datasets

• Allows correction for multiple datasets

• Not confounded by correlations between datasets
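The slides do not spell out how the correction across datasets is done. One approach with both properties listed above is a single-step max-statistic permutation, sketched here purely as an assumption: each random gene set is scored in every dataset at once, and only the largest score is kept, so correlations between datasets carry over into the null distribution. The statistic `set_score` is a deliberately simple stand-in for a GSEA score, and all names are hypothetical.

```python
import random


def set_score(ranked_genes, gene_set):
    """Toy statistic in [-1, 1]: positive if the set sits near the top of the ranking."""
    if len(ranked_genes) < 2:
        return 0.0
    position = {gene: i for i, gene in enumerate(ranked_genes)}
    ranks = [position[g] for g in gene_set if g in position]
    if not ranks:
        return 0.0
    mid = (len(ranked_genes) - 1) / 2
    return (mid - sum(ranks) / len(ranks)) / mid


def max_stat_adjusted_p(datasets, gene_set, n_perm=1000, seed=0):
    """Adjusted p-value per dataset using the maximum statistic across datasets.

    datasets: dict mapping a dataset name to its ranked gene list.
    Returns a dict mapping each name to (observed score, adjusted p-value).
    """
    rng = random.Random(seed)
    universe = sorted(set().union(*datasets.values()))
    size = len(set(gene_set) & set(universe))
    observed = {name: set_score(r, gene_set) for name, r in datasets.items()}
    null_max = []
    for _ in range(n_perm):
        # The same random set is scored in all datasets, preserving their correlation.
        random_set = rng.sample(universe, size)
        null_max.append(max(abs(set_score(r, random_set)) for r in datasets.values()))
    return {
        name: (score, (1 + sum(m >= abs(score) for m in null_max)) / (n_perm + 1))
        for name, score in observed.items()
    }
```

Comparing each observed score against the null of maxima controls the chance of any false positive across the whole collection, without assuming the datasets are independent.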

Capsule synthesis in C. jejuni

Condition                 Data set    Direction of change   p-value
Anaerobic v control       E-BUGS-19   Down                  3.75e-06
Microaerobic v control    E-BUGS-19   Down                  2.88e-05
dksA mutant v wild type   GSE9866     Down                  2.55e-07
In vivo v chemostat       GSE9942     Down                  2.63e-09
CmeR mutant v wild type   GSE5421     Both                  7.21e-09

Nitrogen metabolism in M. bovis

Condition                                             Data set   Direction of change   p-value
Anaerobic M. bovis v control M. tuberculosis          GSE11315   Down                  0.001
M. bovis v control M. tuberculosis                    GSE11315   Down                  0.002
M. tuberculosis 4 mM H2O2 v M. tuberculosis control   GSE365     Down                  0.002
Mpr mutant v control                                  GSE6750    Up                    0.003
espR mutant v control                                 GSE12379   Down                  0.007

Summary

• Collect available microarray data

• Put the different datasets into comparable formats

• GSEA-based analysis

• Identification of experimental conditions of interest
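The "comparable formats" step could look something like the sketch below. It assumes each condition can be reduced to a test and a control measurement per gene, and that datasets are made comparable by converting them to ranked gene lists. The slides do not describe the actual conversion, and the function names, the `floor` parameter, and the gene IDs in the usage example are all illustrative.

```python
import math


def log2_ratio(test, control, floor=1.0):
    """log2(test / control), with intensities floored to avoid log of zero."""
    return math.log2(max(test, floor) / max(control, floor))


def to_ranked_list(test_values, control_values):
    """Reduce one condition to a ranked gene list.

    Both arguments map a gene ID to an intensity. Genes missing from either
    side are dropped. The result is ordered from most up-regulated to most
    down-regulated, with ties broken alphabetically for reproducibility.
    """
    shared = test_values.keys() & control_values.keys()
    ratios = {g: log2_ratio(test_values[g], control_values[g]) for g in shared}
    return sorted(ratios, key=lambda g: (-ratios[g], g))


# Usage with made-up values: geneD has no test measurement and is dropped.
test = {"geneA": 800.0, "geneB": 100.0, "geneC": 50.0}
control = {"geneA": 100.0, "geneB": 100.0, "geneC": 400.0, "geneD": 10.0}
ranked = to_ranked_list(test, control)
```

Working from ranks rather than raw values sidesteps differences in platform and normalisation between depositories, at the cost of discarding the magnitude of each change.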

Work in progress

• Collaboration with Chris Tomlison to create a user interface

• Host on the CISBIC server

• Allow users to test their own gene sets or expression datasets