Understanding Proteomics through Bioinformatics Chris Evelo BiGCaT Bioinformatics Group – BMT-TU/e & UM Masterclass Nutrigenomics; May 11 2004

BiGCaT Bioinformatics: Where the cat hunts

BiGCaT Bioinformatics, bridge between two universities

Universiteit Maastricht: Patients, Experiments, Arrays and Loads of Data

TU/e: Ideas & Experience in Data Handling

BiGCaT Bioinformatics, between two research fields

Cardiovascular Research

Nutritional & Environmental Research

If transcriptomics is:

The study of genome-wide gene expression on the transcriptional level

- Where genome-wide means: >20K genes.
- And transcriptional level means that somehow >20K mRNA sequences have to be analyzed.
- And >20K expression values have to be filtered, normalized, replicate treated, clustered and understood.

Thus no transcriptomics without bioinformatics

Gene expression arrays

Microarrays: relative fluorescence signals. Identification.

Macroarrays: absolute radioactive signal. Validation.
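A relative fluorescence signal is usually reported as a log ratio between a sample channel and a reference channel. The sketch below assumes a two-channel design and uses made-up spot intensities; the function name, the floor value and the gene names are illustrative and do not come from the slides.

```python
import math

def log2_ratio(sample_signal, reference_signal, floor=1.0):
    """Return log2(sample / reference).

    Both signals are raised to at least `floor` (an assumed value)
    so that a zero or negative intensity cannot break the logarithm.
    """
    sample = max(sample_signal, floor)
    reference = max(reference_signal, floor)
    return math.log2(sample / reference)

# Hypothetical spot intensities: (sample channel, reference channel).
spots = {
    "gene_a": (2000.0, 500.0),
    "gene_b": (300.0, 1200.0),
    "gene_c": (800.0, 800.0),
}

ratios = {name: log2_ratio(s, r) for name, (s, r) in spots.items()}
for name, value in ratios.items():
    print(f"{name}: {value:+.2f}")
```

A positive value means more signal in the sample than in the reference, a negative value means less, and zero means no change.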

Then proteomics would be:

The study of genome-wide gene expression on the translational level

Where genome-wide would mean: >20K proteins.

Then proteomics does not yet exist!

Does it already need bioinformatics?

Identification of proteins found (method annotation)

Antibody techniques: built in. You know what the antigen is or you wouldn't use it.

Mass identification: fragment libraries derived from SwissProt. Not normally a user (scientist) problem. Or practically built in as well.

No current need for bioinformatics. But please use SwissProt IDs!

Data filtering and normalization

Appears to become a problem on antibody arrays (see yesterday's presentation by Rachelle van Haaften).

Start with expertise from mRNA microarrays.

Use bioinformatics to improve techniques, not to cover up problems.
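One way to carry over mRNA microarray practice is a background filter followed by median centering of log signals. The sketch below is only an assumption about what such a step could look like for an antibody array; the threshold, the background estimate and the spot names are hypothetical, and the slides do not prescribe this particular method.

```python
import math
from statistics import median

def filter_spots(signals, background, min_ratio=2.0):
    """Keep spots whose signal exceeds min_ratio times the background estimate."""
    return {name: value for name, value in signals.items()
            if value > min_ratio * background}

def median_center_log2(signals):
    """Log2-transform the signals and subtract the array median.

    After centering, the median spot of every array sits at zero,
    which makes arrays with different overall brightness comparable.
    """
    logged = {name: math.log2(value) for name, value in signals.items()}
    center = median(logged.values())
    return {name: value - center for name, value in logged.items()}

# Hypothetical raw signals for one array.
array = {"p1": 1600.0, "p2": 400.0, "p3": 100.0, "p4": 30.0}

kept = filter_spots(array, background=20.0)   # drops spots close to background
normalized = median_center_log2(kept)
for name, value in normalized.items():
    print(f"{name}: {value:+.2f}")
```

Reporting how many spots the filter removes is a cheap check on array quality: a large number points at a technical problem that should be fixed at the bench, not hidden by the normalization.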

Clustering: find proteins with same expression patterns

[Figure: left panel, expression level against time; right panel, T2 signal against T1 signal]

The left-hand picture shows expression patterns for 2 proteins (these should probably end up in the same cluster).

The right-hand picture shows the expression vector for one protein in the first 2 dimensions. It can be normalized by amplitude (circle) or relatively (square).
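The two normalizations can be sketched for a two-point expression vector. Reading "by amplitude" as scaling to unit Euclidean length (the vector ends on a circle) is fairly safe; reading "relatively" as dividing by the largest absolute component (the vector ends on the edge of a square) is an assumption about what the figure meant. Function names and values are illustrative.

```python
import math

def normalize_by_amplitude(vector):
    """Scale to unit Euclidean length; in 2 dimensions the result lies on the unit circle."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / length for x in vector]

def normalize_relative(vector):
    """Scale by the largest absolute component (assumed meaning of 'relatively').

    In 2 dimensions the result lies on the boundary of the unit square.
    """
    largest = max(abs(x) for x in vector)
    if largest == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / largest for x in vector]

# Hypothetical expression vector: signal at T1 and at T2.
protein = [3.0, 4.0]
on_circle = normalize_by_amplitude(protein)
on_square = normalize_relative(protein)
print(on_circle, on_square)
```

Either way the absolute level is removed and only the shape of the pattern remains, which is what a clustering step compares.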

Clustering and grouping of proteins with parallel expressions

Fancy techniques: clustering, principal component analysis, self-organizing maps, etc.

But… Only useful for high numbers (and maybe not even then)

Currently not important for proteomics. But might be useful in combined mRNA/protein studies.
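For small numbers of proteins, a plain correlation check already shows which expression patterns run in parallel. The sketch below is a deliberately simple stand-in for the techniques named above, not an implementation of any of them: each protein joins the first group whose seed profile it correlates with. The threshold and the profiles are made up.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equally long profiles (0.0 if either is flat)."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5

def group_by_correlation(profiles, threshold=0.9):
    """Greedy grouping: join the first group whose seed profile correlates above threshold."""
    groups = []  # list of (seed_profile, member_names)
    for name, profile in profiles.items():
        for seed, members in groups:
            if pearson(seed, profile) >= threshold:
                members.append(name)
                break
        else:
            # No existing group fits, so this protein seeds a new one.
            groups.append((profile, [name]))
    return [members for _, members in groups]

# Hypothetical time courses (4 time points each).
profiles = {
    "protein_1": [1.0, 2.0, 3.0, 4.0],
    "protein_2": [2.0, 4.1, 5.9, 8.0],
    "protein_3": [4.0, 3.0, 2.0, 1.0],
}
clusters = group_by_correlation(profiles)
print(clusters)
```

Because correlation ignores amplitude, protein_2 groups with protein_1 even though its signal is about twice as high. The result depends on input order, which is one reason real studies use more careful methods.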

Two things left

- Functional understanding of proteomics results
- Understanding protein modifications

Functional understanding

Map changed proteins (quantitatively or qualitatively) to known pathways.

Or use information from the Gene Ontology (GO) database.

Steal and smartly adapt a transcriptomics tool: GenMAPP/MAPPFinder.

Let me show you an example from a simple nutrigenomics (starvation) study. Data from Johan Renes.
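Mapping changed proteins to pathways comes down to asking whether a pathway holds more changed proteins than chance would predict. The sketch below uses a hypergeometric test as one common way to score this; it is not claimed to be the statistic GenMAPP/MAPPFinder computes. The pathway names, protein identifiers and data are hypothetical and are not the starvation data from the talk.

```python
from math import comb

def hypergeom_p_value(total, in_pathway, changed, changed_in_pathway):
    """Probability of seeing at least `changed_in_pathway` pathway members
    among `changed` proteins drawn without replacement from `total` proteins,
    of which `in_pathway` belong to the pathway."""
    upper = min(in_pathway, changed)
    favourable = sum(
        comb(in_pathway, k) * comb(total - in_pathway, changed - k)
        for k in range(changed_in_pathway, upper + 1)
    )
    return favourable / comb(total, changed)

def rank_pathways(pathways, measured, changed):
    """Return (p_value, pathway_name, changed_members), most over-represented first."""
    results = []
    for name, members in pathways.items():
        members_measured = members & measured   # ignore proteins never measured
        hits = members_measured & changed
        p = hypergeom_p_value(len(measured), len(members_measured),
                              len(changed), len(hits))
        results.append((p, name, sorted(hits)))
    return sorted(results)

# Hypothetical identifiers; a real study would use SwissProt IDs here.
measured = {f"PROT_{i:02d}" for i in range(1, 11)}
changed = {"PROT_01", "PROT_02", "PROT_03"}
pathways = {
    "pathway_a": {"PROT_01", "PROT_02", "PROT_03", "PROT_04"},
    "pathway_b": {"PROT_01", "PROT_05", "PROT_06", "PROT_07", "PROT_08"},
}

ranked = rank_pathways(pathways, measured, changed)
for p, name, hits in ranked:
    print(f"{name}: p = {p:.3f}, changed members: {hits}")
```

The background set is the list of proteins actually measured, not the whole proteome. With the small protein numbers of current experiments this choice strongly affects the result.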

Understanding protein modifications

Protein variants derived from single genes

[Figure labels: Phosphorylation? Modification? Alternative splicing?]

Understanding modifications

Look up the protein in SwissProt. For instance:
- Glyceraldehyde 3-phosphate dehydrogenase
- Pyruvate kinase (note splice variants)

Or use Prosite Search. For instance:
- Glyceraldehyde 3-phosphate dehydrogenase, with a PKC phosphorylation site and its own GAPDH pattern

Bioinformatics helps to see the possibilities
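A Prosite-style search can be sketched by translating a pattern into a regular expression and scanning a sequence. The converter below handles only a subset of the pattern syntax (residue sets, exclusions in braces, `x` wildcards and repeat counts). The pattern `[ST]-x-[RK]` is the PKC phosphorylation site pattern as commonly listed in PROSITE. The sequence is a made-up toy string, not GAPDH, and the function names are illustrative.

```python
import re

def prosite_to_regex(pattern):
    """Convert a subset of PROSITE pattern syntax to a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split off an optional repeat count such as x(2) or x(2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

def find_sites(sequence, pattern):
    """Return (1-based position, matched residues) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

PKC_SITE = "[ST]-x-[RK]"        # PKC phosphorylation site pattern
toy_sequence = "MSTRKASGK"      # hypothetical sequence for illustration only

sites = find_sites(toy_sequence, PKC_SITE)
print(sites)
```

Such a short pattern matches many sequences by chance, so each hit is a possibility to check, not evidence that the site is actually phosphorylated.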

We should start developing Bioinformatics for Proteomics Now

- To help improve the techniques
- To make the most of the data
- To prevent drowning in data in the future
- And to really understand all that transcriptomics stuff