MED260 Modeling Protein Function - October 11, 2006

Modeling Protein Function
MED260

Philip E. Bourne
Department of Pharmacology, UCSD
[email protected]
http://www.sdsc.edu/pb

Slides on-line at: http://www.sdsc.edu/pb/edu/med260/med260.ppt


Agenda

- Why model protein function?
- Where does it fit as a technique in modern medical research?
- The data deluge as a motivator
- The extent of what can be modeled
- Ontologies - establishing order from chaos
- Examples of what can be learnt
- Accuracy - a word of caution


Why Model Protein Function

- The rate of discovery of new proteins far outpaces our ability to functionally characterize them
- Functional discovery of new proteins has implications in:
  - Drug discovery
  - Biomarker identification
  - Understanding of biological processes
  - Identification of disease states and treatment regimes

Why model protein function?


[Figure: levels of biological complexity in scientific research and discovery. Levels: Organisms, Organs, Cells, Macromolecules, Biopolymers, Atoms & Molecules. Representative disciplines: Anatomy, Physiology, Cell Biology, Proteomics, Genomics, Medicinal Chemistry. Example units and representative technologies: MRI, Heart, Neuron, Structure, Sequence, Protease Inhibitor, Electron Microscopy, Migratory Sensors, Ventricular Modeling, X-ray Crystallography, Protein Docking.]

Where does it fit as a technique in modern medical research?


[Figure: levels-of-biological-complexity diagram (organisms down to atoms and molecules, with representative disciplines, example units, and technologies), annotated "Translational Medicine".]

Where does it fit as a technique in modern medical research?


The Ability to Model Protein Function Influences and Can Be Influenced by Any Level of Biological Complexity - Examples

- Genome - rapid increase in sequenced genomes provides new raw material
- Proteome - large increase in the number of 3D structures highlights new functions
- Interactome - identification of a binding partner points to a new function
- Metabolome - isolation of a protein within a metabolic pathway
- Cell - localization points to function
- Organ - gene expression in heart tissue points to function
- Organism - different physiology observed in species can be related to protein functions

Where does it fit as a technique in modern medical research?


[Figure: levels-of-biological-complexity diagram (organisms down to atoms and molecules, with representative disciplines, example units, and technologies), annotated "We will focus here".]


At All Levels We Are Being Driven By Data

[Figure: growth of biological data, 1990-2005. Workflow: Biological Experiment -> Data -> Information -> Knowledge -> Discovery (Collect, Characterize, Compare, Model, Infer). Complexity axis: Sequence, Structure, Assembly, Sub-cellular, Cellular, Organ, Higher-life. Technology drivers: Computing Power, Sequencing Technology. Milestones shown: E. coli Genome, Yeast Genome, C. elegans Genome, Human Genome Project, Human Genome, ESTs, Gene Chips, 1 Small Genome/Mo., Virus Structure, Ribosome, Model Metabolic Pathway of E. coli, Genetic Circuits, Neuronal Modeling, Cardiac Modeling, Brain Mapping, Virtual Communities. Also plotted: Data (1 to 100000) and # People/Web Site (10^6, 10^2, 1).]

The Data Deluge


Metagenomics: A First Look

- New type of genomics
- New data (and lots of it) and new types of data
  - 17M new (predicted) proteins! 4-5x growth in just a few months, and much more coming
  - New challenges and exacerbation of old challenges

The Data Deluge


Metagenomics: First Results

- More than 99.5% of DNA in every environment studied represents unknown organisms
  - Culturable organisms are exceptions, not the rule
- Most genes represent distant homologs of known genes, but there are thousands of new families
- Everything we touch turns out to be a gold mine
- Environments studied:
  - Water (ocean, lakes)
  - Soil
  - Human body (gut, oral cavity, human microbiome)

The Data Deluge


Metagenomics: New Discoveries

[Figure: environmental (red) vs. currently known PTPases (blue).]

The Data Deluge


The Good News and the Bad News

- Good news
  - Data pointing towards function are growing at near exponential rates
  - IT can handle it on a per dollar basis
- Bad news
  - Data are growing at near exponential rates
  - Quality is highly variable
  - Accurate functional annotation is sparse

The Data Deluge


Genomes - 2004

- We all know about the human - what is not so well known is:
  - 191 completed microbial genomes
  - 44 archaea
  - 727 bacteria
  - 785 eukaryotes (complete or in progress)
  - Viroids ...

The Data Deluge


Proteome

- We are reasonably good at finding proteins in genomes with intergenic regions, but not perfect
  - e.g. alternative initiation codons
- Regulatory elements provide a different set of challenges
- We are not so good at assigning functions to those proteins
- Moreover, the devil is in the details

The Extent of What Can Be Modeled


Estimated Functional Roles (by % of Proteins) of the Proteome in a Complex Organism

The Extent of What Can Be Modeled


Functional Nomenclature Needs to be Consistent for Orderly Progress - Enter EC and GO

- EC classifies all enzymes - http://www.chem.qmul.ac.uk/iubmb/enzyme/
- Gene Ontology Consortium characterizes by molecular function, biological process and cellular location - http://www.geneontology.org/
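A consistent nomenclature is easiest to appreciate as a data structure. The sketch below is an illustration only, not part of the original slides: the class name `FunctionalAnnotation` and its methods are hypothetical, and the EC number and GO identifiers shown for protein kinase A are given as examples that should be checked against the current EC and GO releases before use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The three GO aspects, using GO's own names for them.
GO_ASPECTS = ("molecular_function", "biological_process", "cellular_component")


@dataclass
class FunctionalAnnotation:
    """Hypothetical record tying one protein to EC and GO vocabulary."""
    protein: str
    ec_number: Optional[str] = None
    go_terms: Dict[str, List[str]] = field(
        default_factory=lambda: {aspect: [] for aspect in GO_ASPECTS}
    )

    def add_go(self, aspect: str, term_id: str) -> None:
        # Reject anything outside the controlled vocabulary of aspects.
        if aspect not in GO_ASPECTS:
            raise ValueError(f"unknown GO aspect: {aspect}")
        self.go_terms[aspect].append(term_id)

    def ec_class(self) -> Optional[int]:
        # The first field of an EC number is the top-level enzyme class.
        return int(self.ec_number.split(".")[0]) if self.ec_number else None


# Illustrative values for cAMP-dependent protein kinase (assumed, verify before use).
pka = FunctionalAnnotation(protein="cAMP-dependent protein kinase", ec_number="2.7.11.11")
pka.add_go("molecular_function", "GO:0004672")   # protein kinase activity
pka.add_go("biological_process", "GO:0006468")   # protein phosphorylation
pka.add_go("cellular_component", "GO:0005737")   # cytoplasm
print(pka.ec_class(), pka.go_terms)
```

Restricting annotations to a fixed set of aspects and identifiers is what makes records from different sources comparable.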

Ontologies - establishing order from chaos


Functional Coverage of the Human Genome

http://function.rcsb.org:8080/pdb/function_distribution/index.html

40% covered

The Extent of What Can Be Modeled


Step 1. Learn What You Can from the Protein Sequence

- Find it
- Pay attention to the quality of the functional annotation - errors are transitive
- Understand its 1-D structure - domain organization, {signatures, fingerprints}
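As an illustration of what a sequence "signature" is, the sketch below scans a sequence for a PROSITE-style pattern. It is not from the original slides. The helper names `prosite_to_regex` and `find_motif` are hypothetical, the translator handles only a small subset of the pattern syntax, and the test sequence is made up. The pattern itself is the widely used P-loop (ATP/GTP-binding site motif A) signature.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Handles only: '-' separators, 'x' (any residue), [..] allowed sets,
    {..} forbidden sets, and (n) or (n,m) repeat counts.
    """
    pieces = []
    for element in pattern.split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                              # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"          # forbidden residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        pieces.append(core)
    return "".join(pieces)


def find_motif(sequence: str, pattern: str) -> list:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


P_LOOP = "[AG]-x(4)-G-K-[ST]"      # P-loop signature
toy_sequence = "MKRGPPGSGKSTLAE"   # made-up sequence for illustration
print(find_motif(toy_sequence, P_LOOP))
```

A hit like this is a hint, not an annotation: short signatures match by chance, which is one reason the slide warns about annotation quality.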

Examples of what can be learnt


Step 2. Is there a 3D Structure? If So, What Can You Learn from That?

- Find it
- Understand it
- Characterize it
- Understand its function(s) - these follow a power law at the fold level - some folds are promiscuous (many functions), others are solitary or of unknown function

Examples of what can be learnt


[Figure: (a) myoglobin, (b) hemoglobin, (c) lysozyme, (d) transfer RNA, (e) antibodies, (f) viruses, (g) actin, (h) the nucleosome, (i) myosin, (j) ribosome. Courtesy of David Goodsell, TSRI.]


First, Why Bother with Structure? An Example: Protein Kinase A

- This "molecular scene" for cAMP-dependent protein kinase depicts years of collective knowledge.
- Beyond basics, only the atomic coordinates are captured by the PDB.
- Functional annotation requires the literature

Examples of what can be learnt


What Did that Picture Tell Us?

- Two domains with associated functions: ATP binding & substrate binding
- Through conserved residues and their spatial location, details of the ATP and substrate binding and the mechanism of the phosphotransfer reaction

So is structure the answer to functional modeling?

Examples of what can be learnt


Question: So is structure the answer to functional modeling?

Answer: Partly - The number of unique protein sequences still outnumbers the number of unique structures by 100:1

- Enter Structural Genomics
- Enter Structure Prediction

Examples of what can be learnt


The Structural Genomics Pipeline (X-ray Crystallography)

Basic Steps: Target Selection -> Crystallomics (Isolation, Expression, Purification, Crystallization) -> Data Collection -> Structure Solution -> Structure Refinement -> Functional Annotation -> Publish

Examples of what can be learnt


Structural Genomics Will Give Us...

- Good news
  - More structures (definitely)
  - New folds (some but not as anticipated)
  - New understanding of specific diseases and pathways (maybe)
  - Representatives from each major protein family (maybe)
- Bad news
  - Many new structures that are functionally unclassified (definitely)

Examples of what can be learnt


What About Structure Prediction?

Current rule: We will be able to predict a structure when we know all the structures

Examples of what can be learnt


Why is Structure Prediction so Hard?

[Figure: a random 1000 structurally similar PDB polypeptide chains with z > 4.5, plotted as % sequence identity vs. alignment length, with the Twilight Zone and Midnight Zone marked.]
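The two quantities plotted on this slide, % sequence identity and alignment length, can be computed from any pairwise alignment. The sketch below is an illustration under stated assumptions, not the method used to produce the figure: the function name `identity_and_length` is hypothetical, identity is defined here over columns where neither sequence has a gap (other definitions use a different denominator), and the aligned strings are made up.

```python
def identity_and_length(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (% identity, aligned length) for two gapped, equal-length strings.

    Only columns where neither sequence has a gap are counted; this choice
    of denominator is an assumption, and other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0, 0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns), len(columns)


# Made-up alignment: 6 gap-free columns, 5 of them identical.
pct, length = identity_and_length("ACDE-FGH", "ACNEQFG-")
print(f"{pct:.1f}% identity over {length} aligned residues")
```

Both numbers matter together: a given % identity over a short alignment is much weaker evidence of structural similarity than the same value over a long one.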

Examples of what can be learnt


Approaches to Structure Prediction

- Homology modeling
- Threading (aka fold recognition)
- Ab initio
- How well do we do? - see CASP
- Consensus servers
  - Eva - http://cubic.bioc.columbia.edu/eva/
  - LiveBench - http://bioinfo.pl/meta/

Examples of what can be learnt


Step 3. What Can Be Got from Structure When You Have it?

From Structural Bioinformatics, Ed. Bourne and Weissig, p. 394, Wiley 2002

Examples of what can be learnt


Specific Example

Mj0577 - putative ATP molecular switch

Mj0577 is an open reading frame (ORF) of previously unknown function from Methanococcus jannaschii. Its structure was determined at 1.7 Å (Figure 7a) (Zarembinski et al., 1998). The structure contains a bound ATP molecule, picked up from the E. coli host. The presence of bound ATP led to the proposition that Mj0577 is either an ATPase or an ATP-binding molecular switch. Further experimental work showed that Mj0577 cannot hydrolyse ATP by itself, and can only do so in the presence of M. jannaschii crude cell extract. Therefore it is more likely to act as a molecular switch, in a process analogous to ras-GTP hydrolysis in the presence of GTPase activating protein.

From Structural Bioinformatics, Ed. Bourne and Weissig, p. 402, Wiley 2002

Examples of what can be learnt


Step 4. Proteins Do Not Function in Isolation But are Part of Complex Interaction Networks

http://www.genome.jp/kegg/
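One simple way to use network context, in the spirit of "a binding partner points to a new function", is to tally the annotations of a protein's interaction partners. The sketch below is a toy illustration and not a method described in the slides or provided by KEGG: the function `suggest_functions`, the protein names, the interactions, and the annotations are all hypothetical.

```python
from collections import Counter


def suggest_functions(protein, interactions, annotations):
    """Rank candidate functions by how many interaction partners carry them."""
    votes = Counter()
    for partner in interactions.get(protein, ()):
        votes.update(annotations.get(partner, ()))
    return votes.most_common()


# Hypothetical network: an unannotated ORF with three known partners.
interactions = {"orfX": ["kinaseA", "kinaseB", "transporterC"]}
annotations = {
    "kinaseA": ["protein phosphorylation"],
    "kinaseB": ["protein phosphorylation"],
    "transporterC": ["ion transport"],
}
print(suggest_functions("orfX", interactions, annotations))
```

The ranking is only a hypothesis generator; as with sequence-based transfer, each suggestion inherits any errors in the partners' annotations.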

Examples of what can be learnt


Accuracy - A Word of Caution

- Errors are transitive
  - Proteins A and B are observed to have similar functions through sequence homology
  - Proteins B and C are observed to have similar functions through sequence homology
  - Is protein A related to protein C?
  - Up to 30% of current annotation may be wrong
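The A, B, C argument above can be made concrete with a small sketch. It is an illustration only: the identity values and the 30% transfer threshold are made-up assumptions, and the function names are hypothetical. It shows how chaining pairwise transfers carries an annotation from A to C even though A and C are not directly similar enough to justify it.

```python
from collections import deque


def transfer_annotation(source, identity, threshold):
    """Proteins reachable from `source` by chaining pairs at or above `threshold`."""
    neighbours = {}
    for (a, b), pct in identity.items():
        if pct >= threshold:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    reached, queue = {source}, deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached


def directly_supported(source, identity, threshold):
    """Proteins whose similarity to `source` itself meets the threshold."""
    supported = {source}
    for (a, b), pct in identity.items():
        if pct >= threshold and source in (a, b):
            supported.update((a, b))
    return supported


# Made-up pairwise % identities; the threshold is an assumption for illustration.
identity = {("A", "B"): 42.0, ("B", "C"): 38.0, ("A", "C"): 14.0}
THRESHOLD = 30.0

reached = transfer_annotation("A", identity, THRESHOLD)
unsupported = reached - directly_supported("A", identity, THRESHOLD)
print(sorted(unsupported))   # proteins annotated only through a chain
```

Any protein in `unsupported` carries A's annotation purely by inheritance, so an error in A, or in the A-to-B transfer, propagates to it unchecked.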

Accuracy - A Word of Caution


Questions?


Demo of Steps 1-4

- Step 1. Learn What You Can from the Protein Sequence
- Step 2. Is there a 3D Structure? If So, What Can You Learn from That?
- Step 3. What Can Be Got from Structure When You Have it?
- Step 4. Proteins Do Not Function in Isolation But are Part of Complex Interaction Networks