ELM, Phospho.ELM. June 2009 bioinformatics workshop. Irit Kisslov, Noam Harif.
Eukaryotic Linear Motif (ELM)
Collection of small functional sites/motifs.
Examples:
- protein interaction sites
- cell compartment targeting signals
- post-translational modification sites
- cleavage sites
Functional sites vs. globular domains
Difficult to analyse productively. Why?
- There are many of them.
- They are short.
- No defined structure.
- False positive results.
ELM aims:
- Creation of a comprehensive database of eukaryotic linear motifs
- Providing a resource to aid in ELM discovery
Phospho.ELM version 7.0
INSTANCE INFORMATION
Substrate Protein: Additional information about the substrate protein that contains the phosphorylation site: pattern of expression, interaction partners and interaction network(s) (where available).
Accession Number: Link to the UniProt/Ensembl entry.
Short Description: Brief description of the substrate protein.
Interaction Network: Links to STRING, to retrieve interacting genes/proteins, and/or to NetworKIN, to browse or search which kinase families are predicted to phosphorylate experimentally identified phosphorylation sites in vivo.
Position: Position of the S/T/Y phosphorylation site within the UniProt/TrEMBL sequence. (NB: the position may differ from that given in the literature due to differences in sequence entry.)
Target Sequence: Amino acid sequence of the region flanking the modified residue (+/- 10 amino acids).
Kinase: List of kinases which modify the given residue.
PubMed: Link to the PubMed entry or entries for publications reporting the evidence from which the data was collected. (NB: the site position quoted in the literature may differ from that given in Phospho.ELM due to differences in sequence entry and annotation.)
Source: High Throughput Data (HTP); Low Throughput Data (LTP).
Binding motif: The phosphorylation of the given residue creates a binding motif for a domain involved in signaling (e.g. SH2, 14-3-3, PTB).
PDB/MSD: Link to a macromolecular structure database (MSD-EBI) entry containing a relevant structure covering the phosphorylated site.
SMART: Note of any SMART domain predictions in which the phosphorylation instance resides.
ELM_ID: Link to the ELM server entry for the given phosphorylation and/or binding motif.

SUBSTRATE INFORMATION
Compartment: Gene Ontology term for the cell compartment in which the substrate is found.
Expression: List of tissues in which the substrate is known to be expressed.
Interaction Network: Links to a schematic representation of a signaling pathway in which the substrate is thought to be involved and/or links to Biocarta.
Interaction Partner: Links to the Molecular INTeraction database (MINT) for the interaction partners of the substrate.
Phospho.BLAST Search Help

Phospho.BLAST reports peptides in submitted sequences that match phosphopeptides in the Phospho.ELM database. The database entries are experimentally demonstrated to contain phosphorylated residues, as published in the scientific literature.

It is important to note that Phospho.BLAST does NOT PREDICT phosphorylation motifs in the query protein. Short peptide matches found by BLAST do not have meaningful significance values.

Retrieval by sequence similarity is complementary to retrieval by keyword. It is likely to be useful for retrieving phosphorylation sites in a protein of interest (e.g. when keywords may be poorly defined) or for retrieving phosphorylation sites that are conserved in related proteins (whether orthologues or paralogues). Occasionally, phosphorylation sites in Phospho.ELM are also matched by unrelated query proteins; the user is likely to find these interesting. It is up to the user to consider carefully whether the match has any meaning (e.g. shared kinase and/or phosphopeptide binding domain specificities). Since the match is not significant per se, the biological context should be reviewed to help the user decide whether experimental verification is worth attempting.

To submit a Phospho.BLAST search against the Phospho.ELM data set of known phosphorylated peptides, enter a UniProt ID or ACC, or the raw protein sequence.

The Graphic Output:
The graphic output of the query sequence shows a summary of the results: the detected SMART domains and phosphopeptide matches. Clicking on the matches bar (in black) leads directly to the alignment details.

The Results Table:
The results table shows the phosphopeptide alignments that have been detected. The alignments are sorted by position on the query sequence and NOT by e-value. E-values are not shown because they are not significant for small peptides. Full-length matches are 11 amino acids centered on the phosphorylated residue. Shorter matches are either N- or C-terminal sites or have been truncated by the BLAST algorithm. Phospho.BLAST uses BLAST settings optimised for short peptide matches.
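The 11-residue window described in the help text can be sketched in Python. This is an illustrative helper, not Phospho.BLAST's own code: the function name `phospho_window` and the toy sequence are assumptions, and positions are taken to be 1-based, as in sequence database entries.

```python
def phospho_window(sequence: str, position: int, flank: int = 5) -> str:
    """Return the residues within `flank` positions of a 1-based site.

    The window is clipped at the sequence ends, so sites near the
    N- or C-terminus yield fewer than 2 * flank + 1 residues.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    index = position - 1                          # convert to 0-based
    start = max(0, index - flank)
    end = min(len(sequence), index + flank + 1)
    return sequence[start:end]


# Toy sequence (hypothetical, not a real protein).
toy = "MKRASPLTEYGQWNDVHIKS"
central = phospho_window(toy, 10)    # full 11-residue window around Y10
terminal = phospho_window(toy, 2)    # clipped at the N-terminus
print(central, len(central))
print(terminal, len(terminal))
```

Calling the same helper with `flank=10` gives a window in the "+/- 10 amino acids" style of the Target Sequence field.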
How can ELM be improved to become a useful prediction tool?
- Building an algorithm for prediction using the ELM database.
- Searching for the known motifs in a given protein.
- Evaluation of the results found by ELM.
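Searching a protein for a known motif can be sketched with a regular expression, since a short linear motif is naturally written as a pattern over residues. The pattern `R..[ST]`, the function name `find_motif` and the toy sequence are hypothetical illustrations, not entries taken from the ELM database.

```python
import re
from typing import List, Tuple


def find_motif(sequence: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched text) for every occurrence of pattern.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical pattern: arginine, any two residues, then serine or threonine.
pattern = "R..[ST]"
toy = "MKRPASPLTERRLSTGQ"   # made-up sequence
hits = find_motif(toy, pattern)
print(hits)
```

Even this 17-residue toy sequence yields three hits, which illustrates why short patterns produce many chance matches and why the results need to be evaluated afterwards.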
Accession number, ID (Swiss-Prot/TrEMBL), PROSITE, accession name, ELM.
The entries, manually annotated and based on scientific literature, provide information about the phosphorylated proteins and the exact position of known phosphorylated instances, the kinases responsible for the modification and links to bibliographic references. Additional information is also given about structure, interaction partners, sub-cellular compartment and tissue specificities when available. The data set that we provide you with includes: UniProt/Ensembl accession number, sequence, position, phosphorylated residue, PubMed ID, the upstream kinase (when known), source (high-throughput/low-throughput) and entry date. The sequences are either from the UniProt Knowledgebase Release 12.3 or Ensembl Release 46.
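A record with the fields listed above could be represented as follows. This is a minimal sketch: the tab-separated layout, the column order, the class name `PhosphoSite` and the example record are all assumptions made for illustration, not the actual format of the distributed data set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhosphoSite:
    accession: str
    sequence: str
    position: int           # 1-based position of the modified residue
    residue: str            # S, T or Y
    pubmed_id: str
    kinase: Optional[str]   # None when the upstream kinase is unknown
    source: str             # "HTP" or "LTP"
    entry_date: str


def parse_record(line: str) -> PhosphoSite:
    """Parse one tab-separated record (the column order is an assumption)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    acc, seq, pos, res, pmid, kinase, source, date = fields
    site = PhosphoSite(acc, seq, int(pos), res, pmid, kinase or None, source, date)
    # Sanity checks: the position must lie inside the sequence, and the
    # annotated residue must be the one found at that position.
    if not 1 <= site.position <= len(site.sequence):
        raise ValueError("position lies outside the sequence")
    if site.sequence[site.position - 1] != site.residue:
        raise ValueError("residue does not match the sequence at this position")
    return site


# Made-up record for illustration only.
example = "X00001\tMKRASPLTEY\t5\tS\t0000000\t\tLTP\t2009-06-01"
site = parse_record(example)
print(site.position, site.residue, site.kinase, site.source)
```

The residue check is worth keeping in any real parser, because site positions quoted in the literature may differ from those in the sequence entry.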
p53 (also known as protein 53 or tumor protein 53) is a transcription factor which in humans is encoded by the TP53 gene. p53 is important in multicellular organisms, where it regulates the cell cycle and thus functions as a tumor suppressor that is involved in preventing cancer. As such, p53 has been described as "the guardian of the genome," "the guardian angel gene," and the "master watchman," referring to its role in conserving stability by preventing genome mutation. The name p53 refers to its apparent molecular mass: it runs as a 53 kilodalton (kDa) protein on SDS-PAGE. Based on calculations from its amino acid residues, however, p53's mass is actually only 43.7 kDa. This difference is due to the high number of proline residues in the protein, which slow its migration on SDS-PAGE and thus make it appear heavier than it actually is. This effect is observed with p53 from a variety of species, including humans, rodents, frogs, and fish.
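The "calculation from amino acid residues" mentioned above amounts to summing average residue masses and adding one water molecule for the free termini. The sketch below shows that arithmetic on a made-up peptide; the function name `average_mass` is an assumption, the residue masses are standard average values, and reproducing the figure quoted for p53 would require running it on the actual p53 sequence, which is not included here.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def average_mass(sequence: str) -> float:
    """Return the average molecular mass in daltons of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Made-up peptide for illustration only.
peptide = "MKRASPLTEY"
print(f"{average_mass(peptide) / 1000:.3f} kDa")
```

The calculation ignores post-translational modifications, so it gives the mass of the bare polypeptide chain only.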
* NetworKIN: predicts which protein kinases might target experimentally identified phosphorylation sites in vivo.
* STRING
* CDK2
* MINT: Molecular INTeraction database
* SMART: domain predictions in which the phosphorylation instance resides.
* Short peptide matches do not have meaningful significance values.
* Alignments are sorted by position on the query sequence and NOT by e-value.