Biol 729 – Proteome Bioinformatics Dr M. J. Fisher - Protein: Protein Interactions.

•In order to fully understand what proteins do, we need to know something about the other proteins that they interact with – their partners.

•Most proteins need a partner that, either allosterically or by covalent modification, causes them to change their conformation or activity.

•Two proteomic methods for predicting protein:protein interactions are DNA sequence analysis and yeast two-hybrid (Y2H) analysis.

Interaction between a 7-TM receptor and a heterotrimeric (αβγ) G-protein.

Comparative genomic analysis can be used to predict protein-protein interactions.

•Suppose species A has a protein with two domains (1 and 2) and species B has two proteins, one (1’) containing domain 1 and the other (2’) containing domain 2. Then it is possible to predict that, in species B, proteins 1’ and 2’ may interact to generate the same function as seen in the single ‘dual domain’ protein in species A.

•This approach has been used to predict many protein: protein interactions in yeast and bacteria.

•This sort of in silico analysis can give valuable insights into protein-protein interaction – but it is limited to the specific situation where a protein encoded by a single gene in one species is replaced by two or more interacting proteins in others.
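The domain-fusion reasoning above can be sketched in a few lines of Python. This is only an illustration: the protein names, the domain labels and the function `predict_by_domain_fusion` are hypothetical, and real analyses would work from sequence-based domain assignments rather than hand-written sets.

```python
from itertools import combinations

# Hypothetical domain annotations: protein name -> set of domain identifiers.
species_a = {"A_fused": {"dom1", "dom2"}, "A_other": {"dom3"}}
species_b = {"B_1prime": {"dom1"}, "B_2prime": {"dom2"}, "B_3prime": {"dom3"}}


def predict_by_domain_fusion(fused_species, split_species):
    """Predict interacting pairs in split_species from multi-domain proteins in fused_species."""
    predictions = set()
    for fused_name, fused_domains in fused_species.items():
        if len(fused_domains) < 2:
            continue  # a single-domain protein cannot serve as a fusion template
        for (p, p_domains), (q, q_domains) in combinations(sorted(split_species.items()), 2):
            if not p_domains or not q_domains:
                continue  # proteins without annotated domains give no evidence
            # The two proteins must carry different domains, all of which
            # occur together in the fused protein of the other species.
            if p_domains.isdisjoint(q_domains) and (p_domains | q_domains) <= fused_domains:
                predictions.add((p, q, fused_name))
    return sorted(predictions)


predicted = predict_by_domain_fusion(species_a, species_b)
print(predicted)  # [('B_1prime', 'B_2prime', 'A_fused')]
```

Only the pair 1’ and 2’ is predicted, because only their domains occur together in a single protein of species A.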

Measuring protein-protein interactions.

The yeast two-hybrid (Y2H) method uses a protein of interest as bait in order to discover interacting (or ‘prey’) proteins.

A transcription factor is split into two parts: the DNA-binding domain (DBD) and the activation domain (AD), which stimulates RNA polymerase to begin transcription. The bait protein is fused to the DBD (a), giving DBD-B. A prey ORF, which can encode any known or unknown protein, is fused to the AD (b), giving AD-P. Neither DBD-B nor any of the AD-Ps can, on its own, initiate transcription. When the bait and prey proteins are produced in the same cell and interact, the DBD and AD are brought together and transcription can be initiated.

•i.e. the two domains of the transcription factor (DBD and AD) do NOT need to be part of a single protein – if they are brought together (as a consequence of the bait:prey interaction) then transcription will occur.

•A typical reporter gene is His3, which encodes an enzyme needed to synthesise the amino acid histidine. Without His3 activity, cells cannot grow unless histidine is added to the growth medium.

•The chimeric proteins (i.e. DBD-B and AD-P) are made as translational fusions in yeast. Plasmids are made: one in which the bait coding sequence is fused to the DBD coding sequence, and others in which prey cDNA sequences are fused to the AD coding sequence. The plasmids are then transformed into a suitable yeast strain that allows expression of the individual chimeric proteins.

•An alternative is to use a lacZ reporter gene; the resulting enzyme, β-galactosidase, generates a blue colour, indicating activity and therefore a bait:prey interaction.

•Any ORF can be tested with Y2H, which means that a proteome-wide survey can be performed rapidly by transforming a cDNA library into cells that contain bait plasmids.

•In this way, every protein in a proteome can be tested individually for its potential to interact with bait.

lacZ phenotype - blue colonies

•The Y2H method is not perfect – if the prey ORF was the His3 ORF then the cell would grow in the absence of interaction with the bait.

•As a control, cells containing only prey must be tested on media lacking histidine.

•Also, not all proteins work well inside a cell nucleus, and there may be false-positive or false-negative results due to improper protein folding.

•The greatest benefit of Y2H is that yeast cells can express genes from almost any species, which means this is a powerful proteomics method for Drosophila, C. elegans, Arabidopsis, zebrafish, mice, humans and also yeast.

A lacZ phenotype on both plates (i.e. a false-positive), is boxed in red. The negative control (i.e. no bait) is boxed in black.
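The control described above (testing prey-only cells on medium lacking histidine) amounts to a simple decision rule, sketched here in Python. The prey names, the growth observations and the function `classify_prey` are hypothetical and serve only to show the logic.

```python
# Hypothetical growth observations on medium lacking histidine.
# prey name -> (grows with bait + prey plasmids, grows with prey plasmid only)
observations = {
    "preyA": (True, False),   # grows only when the bait is present
    "preyB": (True, True),    # grows even without the bait
    "preyC": (False, False),  # never grows
}


def classify_prey(grows_with_bait, grows_prey_only):
    """Interpret reporter activity in the light of the prey-only control."""
    if grows_prey_only:
        # The reporter is active without any bait:prey interaction.
        return "false positive"
    if grows_with_bait:
        return "candidate interactor"
    return "no interaction detected"


calls = {prey: classify_prey(*obs) for prey, obs in observations.items()}
for prey, call in calls.items():
    print(prey, "->", call)
```

A prey is kept as a candidate only if the reporter is active with the bait and silent without it. Note that "no interaction detected" may still be a false negative.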

Automated Y2H Screening

Y2H screening can be done at ‘high throughput’ and with high quality. The protocol is as follows:

•Combine the bait with the prey library via an optimised mating protocol, using microtitre plates and laboratory robotics

•Select the ‘positives’ via a quantitative signal

•PCR out the library inserts and analyse them by sequencing and bioinformatics
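Selecting ‘positives’ via a quantitative signal could look like the following Python sketch. The slides do not say how the cutoff is chosen; the rule used here (mean of the negative controls plus three standard deviations), the well names and all readings are assumptions made for illustration.

```python
from statistics import mean, stdev

# Hypothetical reporter readings (arbitrary units) from a microtitre plate.
negative_controls = [0.10, 0.12, 0.09, 0.11, 0.13]  # wells without bait
wells = {"A1": 0.11, "A2": 0.95, "A3": 0.14, "A4": 0.52}


def select_positives(readings, controls, n_sd=3.0):
    """Return wells whose signal exceeds mean(controls) + n_sd * stdev(controls)."""
    cutoff = mean(controls) + n_sd * stdev(controls)
    positives = sorted(well for well, value in readings.items() if value > cutoff)
    return positives, cutoff


positives, cutoff = select_positives(wells, negative_controls)
print(positives)  # ['A2', 'A4']
```

The inserts from the selected wells would then go forward to PCR and sequencing.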

Biochemical evidence for interactions between bait and prey proteins must follow up Y2H-based indications of interactions. This evidence may include:

•Affinity chromatography – link the bait to a gel matrix and use this to specifically bind (and purify) interacting prey from complex protein mixtures.

•Gel overlay assays – similar to Western blots, except that the membrane is probed with the bait protein instead of a specific antibody. The prey/bait interaction is then visualised using an antibody to the bait protein.

•Co-immunoprecipitation – use an anti-bait antibody to immunoprecipitate bait protein from a complex mixture of proteins – prey proteins that bind bait are likely to be co-precipitated.

Protein Interaction Databases

•Using Y2H approaches, databases of protein interactions have been created.

•The application of automated high-throughput Y2H analysis has led to a dramatic increase in the number of protein interactions in these databases.

•However, it is likely that only a small fraction of the total number of interactions has, as yet, been identified. Parallel efforts in yeast and Drosophila show little overlap in datasets.

•False positives continue to be a problem, because the number of interactions that need to be tested by biochemical techniques is overwhelming.

Computational approaches for verification of Y2H data.

•50% or more of the high-throughput Y2H data are likely to be false positives, i.e. the results are ‘noisy’.

•Computational methods, designed to test the quality of ‘interaction maps’, have been developed. The basis for most of these strategies is to test for correlation between interaction data and other properties of the proteins, protein networks or the corresponding genes.

•Interactions that are evolutionarily conserved have a higher probability of being biologically relevant than those detected in only a single organism. Similarly, if two proteins implicated in an interaction have paralogues that also interact, the interaction is more likely to be genuine.

•Genes whose encoded proteins interact may be more likely than random gene pairs to be transcriptionally co-regulated.
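One way to test for transcriptional co-regulation is to correlate the expression profiles of the two genes behind a reported interaction. The Python sketch below uses a Pearson correlation; the gene names and expression values are invented, and the choice of Pearson correlation is an assumption rather than the method of any particular study.

```python
from math import sqrt

# Hypothetical expression levels of three genes across four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
candidate_pairs = [("geneA", "geneB"), ("geneA", "geneC")]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


scores = {
    pair: pearson(expression[pair[0]], expression[pair[1]])
    for pair in candidate_pairs
}
for pair, r in scores.items():
    print(pair, round(r, 3))
```

Here the geneA:geneB pair is strongly co-expressed and so gains support, whereas the geneA:geneC pair does not.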

•Connectivity of protein networks, i.e. if protein A interacts with proteins B and C, then the finding that B and C also interact forms a closed loop of three proteins and gives a measure of interaction reliability. Such a group of proteins may form a module reflecting a discrete biological activity, and such modules are often evolutionarily conserved.

•Another approach is to evaluate the functional activities of interacting proteins. Given that a set of interacting proteins is likely to work in the same biological process, common functional annotations for such proteins support the relevance of the interaction.

•Other comparisons can be made between interaction data and the available set of protein structures or protein domains.
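Two of the checks above, closed loops of three proteins and common functional annotations, can be sketched as simple counts for each reported interaction. The proteins, interactions and annotation terms below are hypothetical, and the function names are illustrative only.

```python
# Hypothetical interaction list and functional annotations.
interactions = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
annotations = {
    "A": {"DNA repair"},
    "B": {"DNA repair", "cell cycle"},
    "C": {"DNA repair"},
    "D": {"transport"},
}


def build_neighbours(edges):
    """Map each protein to the set of its interaction partners."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return neighbours


def support(edge, neighbours, terms):
    """Return (number of shared partners, number of shared annotation terms)."""
    u, v = edge
    shared_partners = neighbours[u] & neighbours[v]  # each one closes a loop of three
    shared_terms = terms.get(u, set()) & terms.get(v, set())
    return len(shared_partners), len(shared_terms)


neighbours = build_neighbours(interactions)
evidence = {edge: support(edge, neighbours, annotations) for edge in interactions}
for edge, (loops, terms) in evidence.items():
    print(edge, "closed loops:", loops, "shared annotations:", terms)
```

The A:B interaction is supported by one closed loop (through C) and one shared annotation, whereas C:D has neither and would be treated with more caution.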

http://dip.doe-mbi.ucla.edu/

The DIP can be searched in a number of different ways. Here the database has been searched with a protein sequence. The search results show, in addition to Protein Name/Description, a Node and a Links link.

The Node link gives information about that protein as a ‘node’ in an interacting network. A graphical view of the node is available – using the graph link.

A list of the interacting proteins is available from the ‘Links’ link in the search result.

Network graphs

•A network of interacting proteins can be represented as a graph, in which each protein is a node.

•The lines between nodes are called edges or arcs.

•The number of edges touching a node is called the degree of the node.
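These definitions map directly onto a small data structure. The Python sketch below stores a hypothetical four-protein network as an adjacency map and reads off the degree of each node; the protein names are invented.

```python
from collections import defaultdict

# Hypothetical interactions; each tuple is one edge between two nodes.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]

# Adjacency map: node -> set of neighbouring nodes.
adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

# The degree of a node is the number of edges touching it.
degree = {node: len(partners) for node, partners in adjacency.items()}
print(degree)  # {'P1': 3, 'P2': 2, 'P3': 2, 'P4': 1}
```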

The C. elegans interactome – a network of 2898 nodes connected by 5460 edges. The terms core and non-core refer to the ‘confidence’ of the interaction in HT-Y2H screens; interologs are conserved interactions as found by in silico searches and scaffolds are interactions revealed by other partial (i.e. relating to specific biological processes [e.g. protein degradation] ) interactome maps of C. elegans.
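As a quick worked figure, the node and edge counts quoted above give the mean degree of this network. Each edge touches two nodes, so the mean degree is twice the number of edges divided by the number of nodes.

```python
nodes = 2898
edges = 5460

# Every edge contributes to the degree of two nodes.
mean_degree = 2 * edges / nodes
print(round(mean_degree, 2))  # 3.77
```

On average, then, each protein in this map has fewer than four partners, although the distribution around that average is far from uniform.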

A highly interconnected sub-network around two C. elegans proteins – that are components of a conserved network of transcription factors.

http://www.droidb.org/Index.jsp

http://string.embl.de/

Interactome Properties

•A feature of protein networks that emerges from large-scale approaches is that the number of links per protein is highly non-uniform, ranging from a few hubs with many connections to the great majority of proteins with only a few connections, i.e. there is a ‘scale-free’ degree distribution.

•Any two proteins can be connected by a path with only a few links (a characteristic of the World Wide Web!). This is the ‘small-world’ property. The evolution of this topology can be explained by the preferential attachment of new nodes to ones that already have many links, in a process related to gene duplication.

i.e. highly connected proteins are more likely to interact with a protein that is duplicated. Thus, highly connected proteins gain even more links.

•Another aspect of protein networks is their robustness, with random loss of proteins mostly affecting the many proteins with only a few partners rather than the small number of hubs. In yeast, deletion of genes encoding highly connected proteins is three times more likely to result in a lethal phenotype than deletion of other genes.
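Preferential attachment can be illustrated with a toy simulation in Python. This is a minimal sketch, not a model of real proteome evolution: each new node makes a single link, the network size and random seed are arbitrary, and the function `grow_network` is hypothetical.

```python
import random
from collections import Counter


def grow_network(n_nodes, seed=0):
    """Grow a network in which each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per edge it has, so a uniform
    # draw from it selects a node in proportion to its degree.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


edges = grow_network(500)

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# distribution[k] is the number of nodes that have exactly k links.
distribution = Counter(degree.values())
for k in sorted(distribution):
    print(k, distribution[k])
```

The printed distribution should show most nodes with a single link and a few hubs with many, the ‘rich get richer’ pattern described above.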

Summary

•In order to fully understand what proteins do, we need to know something about the other proteins that they interact with.

•Comparative genomic analysis can be used to predict protein-protein interactions. Experimentally, the yeast two-hybrid (Y2H) method uses a protein of interest as bait in order to discover interacting (or ‘prey’) proteins.

•Y2H screening can be done at ‘high throughput’ and with high quality.

•Biochemical evidence (affinity chromatography, gel-overlay assays, co-immunoprecipitations) for interactions between bait and prey proteins must follow up Y2H-based indications of interactions.

•Computational approaches are used for verification of Y2H data.

•Protein:protein interactions (the Interactome) can be represented by network graphs.

References

Uetz, P. et al., (2000) ‘A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae’. Nature 403, 623-627.

Fields, S. (2005) ‘High-throughput two-hybrid analysis – the promise and the peril’. FEBS Journal, 272, 5391-5399.

Li, S. et al., (2004) ‘A map of the interactome network of the metazoan C. elegans’. Science 303, 540-543.