Centre for Integrative Bioinformatics VU
Bioinformatics Master Course: DNA/Protein Structure-Function Analysis and Prediction
Lecture 13: Protein Function



Sequence-Structure-Function

[Diagram: Sequence, Structure and Function, linked by the following methods: threading; homology searching (BLAST); ab initio prediction and folding (impossible but for the smallest structures); function prediction from structure (very difficult).]


Functional Genomics – Systems Biology

[Diagram labels: Genome; Expressome; Proteome; Metabolome; tertiary structure (fold); metabolomics; fluxomics.]


Systems Biology

is the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behaviour of that system (for example, the enzymes and metabolites in a metabolic pathway). The aim is to understand the system quantitatively and to be able to predict how it behaves over time.

• the interactions are nonlinear

• the interactions give rise to emergent properties, i.e. properties that cannot be explained by the individual components of the system alone

• Biological processes include many time-scales, many compartments and many interconnected network levels (e.g. regulation, signalling, expression,..)


Systems Biology

understanding is often achieved through modeling and simulation of the system’s components and interactions.

Many times, the ‘four Ms’ cycle is adopted:

Measuring

Mining

Modeling

Manipulating


‘The silicon cell’

(some people think ‘silly-con’ cell)


A system response

Apoptosis: programmed cell death

Necrosis: accidental cell death


This pathway diagram compares a pathway in Homo sapiens (human, left) and Saccharomyces cerevisiae (baker’s yeast, right). Both the controlling enzymes (square boxes in red) and the pathway itself have changed: yeast has one altered (‘overtaking’) path in the graph.

We need to be able to do automatic pathway comparison (pathway alignment)

[Figure panels: Human (left), Yeast (right)]

‘Comparative metabolomics’
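As a rough illustration of what automatic pathway comparison could start from, the sketch below treats each pathway as a set of directed edges and reports which edges are shared and which are species-specific. This is only a set-overlap comparison, not a true pathway alignment, and the metabolite names and the two example pathways are hypothetical placeholders rather than data from the slide.

```python
def compare_pathways(pathway_a, pathway_b):
    """Compare two pathways given as collections of (substrate, product) edges.

    Returns the shared edges, the edges unique to each pathway, and the
    Jaccard similarity (shared edges divided by all distinct edges).
    """
    a, b = set(pathway_a), set(pathway_b)
    union = a | b
    jaccard = len(a & b) / len(union) if union else 1.0
    return {
        "shared": a & b,
        "only_a": a - b,
        "only_b": b - a,
        "jaccard": jaccard,
    }


# Hypothetical toy pathways: the second one skips an intermediate step,
# loosely mimicking an 'overtaking' path.
human = {("A", "B"), ("B", "C"), ("C", "D")}
yeast = {("A", "B"), ("B", "D")}

result = compare_pathways(human, yeast)
print(sorted(result["shared"]))
print(sorted(result["only_a"]))
print(sorted(result["only_b"]))
print(result["jaccard"])
```

A real pathway alignment would also have to match equivalent enzymes and metabolites across species (for example through homology) before comparing edges; the sketch assumes that matching has already been done.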


The citric-acid cycle

http://en.wikipedia.org/wiki/Krebs_cycle


Fig. 1. (a) A graphical representation of the reactions of the citric-acid cycle (CAC), including the connections with pyruvate and phosphoenolpyruvate, and the glyoxylate shunt. When there are two enzymes that are not homologous to each other but that catalyse the same reaction (non-homologous gene displacement), one is marked with a solid line and the other with a dashed line. The oxidative direction is clockwise. The enzymes with their EC numbers are as follows:

1, citrate synthase (4.1.3.7);
2, aconitase (4.2.1.3);
3, isocitrate dehydrogenase (1.1.1.42);
4, 2-ketoglutarate dehydrogenase (solid line; 1.2.4.2 and 2.3.1.61) and 2-ketoglutarate ferredoxin oxidoreductase (dashed line; 1.2.7.3);
5, succinyl-CoA synthetase (solid line; 6.2.1.5) or succinyl-CoA–acetoacetate-CoA transferase (dashed line; 2.8.3.5);
6, succinate dehydrogenase or fumarate reductase (1.3.99.1);
7, fumarase (4.2.1.2) class I (dashed line) and class II (solid line);
8, bacterial-type malate dehydrogenase (solid line) or archaeal-type malate dehydrogenase (dashed line) (1.1.1.37);
9, isocitrate lyase (4.1.3.1);
10, malate synthase (4.1.3.2);
11, phosphoenolpyruvate carboxykinase (4.1.1.49) or phosphoenolpyruvate carboxylase (4.1.1.32);
12, malic enzyme (1.1.1.40 or 1.1.1.38);
13, pyruvate carboxylase or oxaloacetate decarboxylase (6.4.1.1);
14, pyruvate dehydrogenase (solid line; 1.2.4.1 and 2.3.1.12) and pyruvate ferredoxin oxidoreductase (dashed line; 1.2.7.1).

M. A. Huynen, T. Dandekar and P. Bork, "Variation and evolution of the citric acid cycle: a genomic approach", Trends Microbiol, 7, 281-29 (1999)


(b) Individual species might not have a complete CAC. This diagram shows the genes for the CAC for each unicellular species for which a genome sequence has been published, together with the phylogeny of the species. The distance-based phylogeny was constructed using the fraction of genes shared between genomes as a similarity criterion [29]. The major kingdoms of life are indicated in red (Archaea), blue (Bacteria) and yellow (Eukarya). Question marks represent reactions for which there is biochemical evidence in the species itself or in a related species but for which no genes could be found. Genes that lie in a single operon are shown in the same color. Genes were assumed to be located in a single operon when they were transcribed in the same direction and the stretches of non-coding DNA separating them were less than 50 nucleotides in length.
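The operon assumption stated in the caption (same transcription direction, fewer than 50 non-coding nucleotides between neighbouring genes) can be sketched as a simple grouping rule. The `Gene` record, the 1-based inclusive coordinates and the example genes below are assumptions made for illustration; they are not taken from the paper.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def group_operons(genes, max_gap=50):
    """Group neighbouring genes into putative operons.

    Two adjacent genes are placed in the same operon when they lie on the
    same strand and fewer than `max_gap` non-coding nucleotides separate them.
    """
    operons = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            previous = operons[-1][-1]
            gap = gene.start - previous.end - 1  # nucleotides between the genes
            if gene.strand == previous.strand and gap < max_gap:
                operons[-1].append(gene)
                continue
        operons.append([gene])
    return operons


# Hypothetical genes on one chromosome.
genes = [
    Gene("geneA", 1, 100, "+"),
    Gene("geneB", 120, 300, "+"),   # 19 nt after geneA, same strand
    Gene("geneC", 400, 500, "+"),   # 99 nt after geneB: too far
    Gene("geneD", 520, 600, "-"),   # close to geneC but opposite strand
]
operon_names = [[g.name for g in operon] for operon in group_operons(genes)]
print(operon_names)
```

Overlapping genes give a negative gap and are therefore grouped together when they share a strand, which seems consistent with the rule as stated.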


Experimental

• Structural genomics

• Functional genomics

• Protein-protein interaction

• Metabolic pathways

• Expression data


Communicability: Functional Genomics

• Interpretation of genome-scale gene expression data

[Diagram elements: DNA-chip data; external program; cluster of coregulated genes (gene 1, gene 2, ..., gene n); PFMP query; pathways affected (pathway 1, pathway 2).]


Communicability: Functional Genomics

• Interpretation of genome-scale gene expression data

[Diagram elements: DNA-chip data; external programs; cluster of coregulated genes (gene 1, gene 2, ..., gene n); PFMP query; similarities with known regulatory sites (site 1 – Factor 1, site 2 – Factor 2, ...); pattern discovery on gene 1, gene 2, ... (putative regulatory sites).]


Other Issues

• Partial information (indirect interactions) and subsequent filling of the missing steps

• Negative results (elements that have been shown not to interact, enzymes missing in an organism)

• Putative interactions resulting from computational analyses


Protein function categories

• Catalysis (enzymes)

• Binding – transport (active/passive)

• Protein-DNA/RNA binding (e.g. histones, transcription factors)

• Protein-protein interactions (e.g. antibody-lysozyme) (experimentally determined by yeast two-hybrid (Y2H) or bacterial two-hybrid (B2H) screening)

• Protein-fatty acid binding (e.g. apolipoproteins)

• Protein – small molecules (drug interaction, structure decoding)

• Structural component (e.g. crystallin)

• Regulation

• Signalling

• Transcription regulation

• Immune system

• Motor proteins (actin/myosin)


Catalytic properties of enzymes

[Plot: reaction rate V (moles/s) against substrate concentration [S]; the curve saturates at Vmax and reaches Vmax/2 at [S] = Km.]

Reaction scheme:

$$E + S \underset{K_m}{\rightleftharpoons} ES \xrightarrow{k_{cat}} E + P$$

• E = enzyme

• S = substrate

• ES = enzyme-substrate complex (transition state)

• P = product

• Km = Michaelis constant

• kcat = catalytic rate constant (turnover number)

• kcat/Km = specificity constant (useful for comparison)

Michaelis-Menten equation:

$$V = \frac{V_{max}\,[S]}{K_m + [S]}$$
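A small numerical sketch may help to read the Michaelis-Menten equation. The function names and the parameter values below are illustrative assumptions; only the formula itself and the definition of the specificity constant come from the slide.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Reaction rate V = Vmax * [S] / (Km + [S])."""
    if substrate < 0 or vmax < 0 or km <= 0:
        raise ValueError("expected [S] >= 0, Vmax >= 0 and Km > 0")
    return vmax * substrate / (km + substrate)


def specificity_constant(kcat, km):
    """Specificity constant kcat / Km, used to compare enzymes or substrates."""
    if km <= 0:
        raise ValueError("expected Km > 0")
    return kcat / km


# Hypothetical enzyme: Vmax = 10 (moles/s), Km = 2 (same units as [S]).
vmax, km = 10.0, 2.0
for s in (0.0, 2.0, 20.0, 200.0):
    print(s, michaelis_menten_rate(s, vmax, km))
```

At [S] = Km the rate is exactly Vmax/2, and for [S] much larger than Km the rate approaches Vmax, which matches the shape of the plot on the slide.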


Protein interaction domains

http://pawsonlab.mshri.on.ca/html/domains.html


Energy difference upon binding

Examples of protein interactions (and functional importance) include:

• Protein – protein (pathway analysis);

• Protein – small molecules (drug interaction, structure decoding);

• Protein – peptides, DNA/RNA (function analysis) 

The change in Gibbs free energy of the protein-ligand binding interaction can be expressed as:

$$\Delta G = \Delta H - T\,\Delta S$$        (H = enthalpy, S = entropy and T = temperature)
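The free-energy relation can be made concrete with a short calculation. The sketch below evaluates ΔG from ΔH, T and ΔS, and additionally converts a standard binding free energy into a dissociation constant via ΔG° = RT ln Kd; that second relation is standard thermodynamics but is an addition not shown on the slide, and all numerical values are hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J / (mol K)


def gibbs_free_energy(delta_h, temperature, delta_s):
    """Return dG = dH - T * dS (J/mol, with dS in J/(mol K) and T in kelvin)."""
    return delta_h - temperature * delta_s


def dissociation_constant(delta_g, temperature):
    """Kd (relative to a 1 M standard state) from dG = R * T * ln(Kd).

    This relation is an assumption added for illustration; a more negative
    binding free energy gives a smaller Kd, i.e. tighter binding.
    """
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


# Hypothetical binding event: favourable enthalpy, unfavourable entropy.
delta_g = gibbs_free_energy(delta_h=-50_000.0, temperature=300.0, delta_s=-100.0)
kd = dissociation_constant(delta_g, temperature=300.0)
print(delta_g)  # J/mol
print(kd)       # dimensionless, relative to 1 M
```

In this example the enthalpy term outweighs the entropic penalty, so ΔG is negative and binding is favourable at the chosen temperature.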


Protein function

• Many proteins combine functions

• Some immunoglobulin structures are thought to have more than 100 different functions (and active/binding sites)

• Alternative splicing can generate (partially) alternative structures


Protein function & Interaction

Active site / binding cleft

Shape complementarity


Protein function evolution

Chymotrypsin


How to infer function

• Experiment

• Deduction from sequence
  • Multiple sequence alignment – conservation patterns
  • Homology searching

• Deduction from structure
  • Threading
  • Structure-structure comparison
  • Homology modelling


Cholesterol Biosynthesis:

Cholesterol biosynthesis primarily occurs in eukaryotic cells. It is necessary for membrane synthesis, and is a precursor for steroid hormone production as well as for vitamin D. While the pathway had previously been assumed to be localized in the cytosol and ER, more recent evidence suggests that many of the enzymes in the pathway exist largely, if not exclusively, in the peroxisome (the enzymes listed in blue in the pathway to the left are thought to be at least partly peroxisomal). Patients with peroxisome biogenesis disorders (PBDs) have a variable deficiency in cholesterol biosynthesis.

Cholesterol Biosynthesis: from acetyl-CoA to mevalonate

Mevalonate plays a role in epithelial cancers: it can inhibit EGFR


Epidermal Growth Factor as a Clinical Target in Cancer

A malignant tumour is the product of uncontrolled cell proliferation. Cell growth is controlled by a delicate balance between growth-promoting and growth-inhibiting factors. In normal tissue the production and activity of these factors results in differentiated cells growing in a controlled and regulated manner that maintains the normal integrity and functioning of the organ. The malignant cell has evaded this control; the natural balance is disturbed (via a variety of mechanisms) and unregulated, aberrant cell growth occurs. A key driver for growth is the epidermal growth factor (EGF) and the receptor for EGF (the EGFR) has been implicated in the development and progression of a number of human solid tumours including those of the lung, breast, prostate, colon, ovary, head and neck.


Energy housekeeping: Adenosine diphosphate (ADP) – Adenosine triphosphate (ATP)


Chemical Reaction


Enzymatic Catalysis


Gene Expression


Inhibition


Metabolic Pathway: Proline Biosynthesis


Transcriptional Regulation


Methionine Biosynthesis in E. coli


Shortcut Representation


High-level Interaction


Levels of Resolution


Cholesterol Biosynthesis


SREBP Pathway


Signal Transduction

Important signalling pathways: the MAP kinase (MAPK) signalling pathway, or the TGF-β pathway


Transport


Phosphate Utilization in Yeast


Multiple Levels of Regulation

• Gene expression

• Protein activity

• Protein intracellular location

• Protein degradation

• Substrate transport


Graphical Representation – Gene Expression


Experimental Data – Gene Expression


Experimental Data – Transcriptional Regulation



Transcriptional Regulation: Integrated View


Pathways and Pathway Diagrams

• Pathways

• Set of nodes (entities) and edges (associations)

• Pathway Diagrams

• XY coordinates

• Node splitting allowed

• Multiple views of the same pathway

• Different abstraction levels
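The distinction between a pathway (nodes and edges) and a pathway diagram (a drawing of it) can be sketched as two small data structures. The class names, field names and the example entities below are hypothetical; the sketch only illustrates that one pathway may have several diagrams and that a node may be drawn more than once (node splitting).

```python
from dataclasses import dataclass, field


@dataclass
class Pathway:
    """A pathway as a set of nodes (entities) and edges (associations)."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, target, association)

    def add_edge(self, source, target, association):
        self.nodes.update((source, target))
        self.edges.add((source, target, association))


@dataclass
class PathwayDiagram:
    """One view of a pathway: XY positions for (possibly repeated) nodes."""
    pathway: Pathway
    name: str
    placements: list = field(default_factory=list)  # (node, x, y)

    def place(self, node, x, y):
        if node not in self.pathway.nodes:
            raise KeyError(f"{node!r} is not a node of this pathway")
        # The same node may be placed several times (node splitting).
        self.placements.append((node, x, y))


# Hypothetical pathway with a cofactor used by two reactions.
pathway = Pathway()
pathway.add_edge("metabolite A", "metabolite B", "reaction 1")
pathway.add_edge("metabolite B", "metabolite C", "reaction 2")
pathway.add_edge("cofactor X", "metabolite B", "reaction 1")
pathway.add_edge("cofactor X", "metabolite C", "reaction 2")

detailed = PathwayDiagram(pathway, "detailed view")
detailed.place("metabolite A", 0, 0)
detailed.place("metabolite B", 0, 1)
detailed.place("metabolite C", 0, 2)
detailed.place("cofactor X", 1, 0)   # drawn next to reaction 1
detailed.place("cofactor X", 1, 1)   # drawn again next to reaction 2

overview = PathwayDiagram(pathway, "overview")  # a second view, same pathway
overview.place("metabolite A", 0, 0)
overview.place("metabolite C", 0, 1)
```

Keeping the coordinates out of `Pathway` is one possible design choice: the biological content stays the same while each diagram chooses its own layout and level of abstraction.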


KEGG database (Japan)

Metabolic networks

Glycolysis and Gluconeogenesis


Gene Ontology (GO)

• Not a genome sequence database

• Developing three structured, controlled vocabularies (ontologies) to describe gene products in terms of:

• biological process

• cellular component

• molecular function

in a species-independent manner
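To illustrate how the three vocabularies are used, the sketch below stores annotations of gene products per ontology and looks up which products share a term. The `AnnotationStore` class and the gene product names are hypothetical; the three GO identifiers used are real terms (GO:0003824 catalytic activity, GO:0008152 metabolic process, GO:0005737 cytoplasm), but the annotations themselves are invented for the example.

```python
from collections import defaultdict

ONTOLOGIES = ("biological_process", "cellular_component", "molecular_function")


class AnnotationStore:
    """Maps gene products to GO term identifiers, one set per ontology."""

    def __init__(self):
        self._terms = defaultdict(lambda: {name: set() for name in ONTOLOGIES})

    def annotate(self, product, ontology, term_id):
        if ontology not in ONTOLOGIES:
            raise ValueError(f"unknown ontology: {ontology}")
        self._terms[product][ontology].add(term_id)

    def terms(self, product, ontology):
        if product not in self._terms:
            return []
        return sorted(self._terms[product][ontology])

    def products_with(self, term_id):
        return sorted(
            product
            for product, by_ontology in self._terms.items()
            if any(term_id in ids for ids in by_ontology.values())
        )


store = AnnotationStore()
# Hypothetical gene products from two species share the same vocabulary.
store.annotate("yeast:geneX", "molecular_function", "GO:0003824")
store.annotate("yeast:geneX", "biological_process", "GO:0008152")
store.annotate("yeast:geneX", "cellular_component", "GO:0005737")
store.annotate("human:geneY", "molecular_function", "GO:0003824")

print(store.terms("yeast:geneX", "molecular_function"))
print(store.products_with("GO:0003824"))
```

Because the term identifiers do not depend on the organism, the same query returns products from both species, which is the practical meaning of "species-independent" here.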


The GO ontology


Gene Ontology Members

• FlyBase - database for the fruitfly Drosophila melanogaster

• Berkeley Drosophila Genome Project (BDGP) - Drosophila informatics; GO database & software, Sequence Ontology development

• Saccharomyces Genome Database (SGD) - database for the budding yeast Saccharomyces cerevisiae

• Mouse Genome Database (MGD) & Gene Expression Database (GXD) - databases for the mouse Mus musculus

• The Arabidopsis Information Resource (TAIR) - database for the brassica family plant Arabidopsis thaliana

• WormBase - database for the nematode Caenorhabditis elegans

• EBI GOA project: annotation of UniProt (Swiss-Prot/TrEMBL/PIR) and InterPro databases

• Rat Genome Database (RGD) - database for the rat Rattus norvegicus

• DictyBase - informatics resource for the slime mold Dictyostelium discoideum

• GeneDB S. pombe - database for the fission yeast Schizosaccharomyces pombe (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)

• GeneDB for protozoa - databases for Plasmodium falciparum, Leishmania major, Trypanosoma brucei, and several other protozoan parasites (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)

• Genome Knowledge Base (GK) - a collaboration between Cold Spring Harbor Laboratory and EBI

• TIGR - The Institute for Genomic Research

• Gramene - A Comparative Mapping Resource for Monocots

• Compugen (with its Internet Research Engine)

• The Zebrafish Information Network (ZFIN) - reference datasets and information on Danio rerio