EMBO Workshop, Cape Town, 2014

[Figure labels: out, membrane, in]

1. Tools -> Sequence -> Multialign Viewer
2. Choose "PF03595_seed.txt"
3. Select Aligned FASTA
4. Structure -> Render by Conservation
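To make the idea behind "Render by Conservation" concrete, here is a minimal sketch that scores each column of an aligned FASTA by the fraction of sequences sharing its most common residue. The toy alignment, the function names and the scoring rule are assumptions chosen for illustration; the viewer's own conservation measure may differ, and the real input would be the PF03595 seed alignment.

```python
from collections import Counter


def read_aligned_fasta(text):
    """Parse aligned FASTA text into a dict of name -> aligned sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def column_conservation(records, gap_chars="-."):
    """Per column: fraction of all sequences that carry the most common residue.

    Gaps count toward the denominator but are never the 'most common residue'.
    """
    seqs = list(records.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences are not aligned to the same length")
    scores = []
    for col in zip(*seqs):
        residues = [c.upper() for c in col if c not in gap_chars]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(col))
    return scores


# Toy alignment (hypothetical stand-in for a real seed alignment).
TOY_ALIGNMENT = """>seq1
ACG-LD
>seq2
SCG--E
>seq3
NCGGFD
>seq4
TCG-WQ
"""

records = read_aligned_fasta(TOY_ALIGNMENT)
scores = column_conservation(records)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 2))
```

Columns with a score near 1 are the ones a conservation colouring would highlight on the structure; a column made mostly of gaps scores low under this rule.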

[Figure labels: out, out, out, out]

1. Actions -> Ribbon -> hide

Chen et al. Nature 467 (2010)

H. influenzae protein structure

TUM, January 2013

- Functional hypothesis via homology to SLAC1
- Identification of potential functional residues using sequence conservation across the family and structural knowledge
- Suggested experiments to test functional hypothesis

Thomine and Barbier-Brygoo Nature 467:1058-59 (2010)

- Anion channels
- exfoliative toxins
- malate uptake transporter
- sulphite efflux pump

Chen et al. Nature 467 (2010)

Example (using homology for protein annotation)

>protein
MESRSSPRLECSGAISAHCSLHLPDSSDFQLIFVFLVEMGFHHVGQAGLELLISSDLPTSASQSAGITDMKLSMKNNIINTQQSFVTMPNVIVPDIEKEIRRMENGACSSFSEDDDSASTSEESENENPHARGSFSYKSLRKGGPSQREQYLPGAIALFNVNNSSNKDQEPEEKKKKKKEKKSKSDDKNENKNDPEKKKKKKDKEKKKKEEKSKDKKEEEKKEVVVIDPSGNTYYNWLFCITLPVMYNWTMVIARACFDELQSDYLEYWLILDYVSDIVYLIDMFVRTRTGYLEQGLLVKEELKLINKYKSNLQFKLDVLSLIPTDLLYFKLGWNYPEIRLNRLLRFSRMFEFFQRTETRTNYPNIFRISNLVMYIVIIIHWNACVFYSISKAIGFGNDTWVYPDINDPEFGRLARKYVYSLYWSTLTLTTIGETPPPVRDSEYVFVVVDFLIGVLIFATIVGNIGSMISNMNAARAEFQARIDAIKQYMHFRNVSKDMEKRVIKWFDYLWTNKKTVDEKEVLKYLPDKLRAEIAINVHLDTLKKVRIFADCEAGLLVELVLKLQPQVYSPGDYICKKGDIGREMYIIKEGKLAVVADDGVTQFVVLSDGSYFGEISILNIKGSKAGNRRTANIKSIGYSDLFCLSKDDLMEALTEYPDAKTMLEEKGKQILMKDGLLDLNIANAGSDPKDLEEKVTRMEGSVDLLQTRFARILAEYESMQQKLKQRLTKVEKFLKPLIDTEFSSIEGPGAESGPIDST

What is the function of our protein?

[Figure: 1MBN. Colour scheme: N -> C]

[Figure: 1HW2. Colour scheme: N -> C]

[Figure: 1FOK and 1HW2. Colour scheme: N -> C]

Z-score = 4.0, %id = 8%, RMSD = 2.7
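Two of the three numbers quoted for a structural comparison, %id and RMSD, are simple to compute once an alignment and a superposition exist. The sketch below assumes the sequences are already aligned and the coordinates are already superposed and paired; the function names and toy inputs are hypothetical, and the denominator used for percent identity (aligned non-gap pairs) is one of several conventions in use. The Z-score depends on the comparison tool's own statistics and is not reproduced here.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap column pairs that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy inputs (hypothetical, not taken from 1FOK or 1HW2).
identity = percent_identity("ACDEFG-HIK", "ACQEYGLH-K")
deviation = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                 [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)])
print(identity, deviation)
```

A low percent identity together with a modest RMSD, as on the slide, is the typical signature of a relationship that structure reveals but sequence alone would miss.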

[Figure: 1FOK and 1HW2. Colour scheme: N -> C]

1HW2: ?
1FOK: Restriction endonuclease

1HW2: Transcription factor
1FOK: Restriction endonuclease

Definition (Wikipedia): A protein domain is a conserved part of a given protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. Each domain forms a compact three-dimensional structure and often can be independently stable and folded.

Human methionine aminopeptidase 2

Vogel et al. Curr. Opin. Struct. Biol. 14 (2004)

Transcription factor
Restriction endonuclease

Homologous domains can also be functionally diverse

(b) FokI is a member of an unusual class of bipartite restriction enzymes that recognize a specific DNA sequence and cleave DNA nonspecifically a short distance away from that sequence. Because of its unusual bipartite nature, FokI has been used to create artificial enzymes with new specificities. We have determined the crystal structure at 2.8 Å resolution of the complete FokI enzyme bound to DNA. As anticipated, the enzyme contains amino- and carboxy-terminal domains corresponding to the DNA-recognition and cleavage functions, respectively. The recognition domain is made of three smaller subdomains (D1, D2 and D3) which are evolutionarily related to the helix-turn-helix-containing DNA-binding domain of the catabolite gene activator protein CAP. The CAP core has been extensively embellished in the first two subdomains, whereas in the third subdomain it has been co-opted for protein-protein interactions. Surprisingly, the cleavage domain contains only a single catalytic centre, raising the question of how monomeric FokI manages to cleave both DNA strands. Unexpectedly, the cleavage domain is sequestered in a 'piggyback' fashion by the recognition domain. The structure suggests a new mechanism for nuclease activation and provides a framework for the design of chimaeric enzymes with altered specificities.

What is the function of our protein?

>protein
MKLSMKNNIINTQQSFVTMPNVIVPDIEKEIRRMENGACSSFSEDDDSASTSEESENENPHARGSFSYKSLRKGGPSQREQYLPGAIALFNVNNSSNKDQEPEEKKKKKKEKKSKSDDKNENKNDPEKKKKKKDKEKKKKEEKSKDKKEEEKKEVVVIDPSGNTYYNWLFCITLPVMYNWTMVIARACFDELQSDYLEYWLILDYVSDIVYLIDMFVRTRTGYLEQGLLVKEELKLINKYKSNLQFKLDVLSLIPTDLLYFKLGWNYPEIRLNRLLRFSRMFEFFQRTETRTNYPNIFRISNLVMYIVIIIHWNACVFYSISKAIGFGNDTWVYPDINDPEFGRLARKYVYSLYWSTLTLTTIGETPPPVRDSEYVFVVVDFLIGVLIFATIVGNIGSMISNMNAARAEFQARIDAIKQYMHFRNVSKDMEKRVIKWFDYLWTNKKTVDEKEVLKYLPDKLRAEIAINVHLDTLKKVRIFADCEAGLLVELVLKLQPQVYSPGDYICKKGDIGREMYIIKEGKLAVVADDGVTQFVVLSDGSYFGEISILNIKGSKAGNRRTANIKSIGYSDLFCLSKDDLMEALTEYPDAKTMLEEKGKQILMKDGLLDLNIANAGSDPKDLEEKVTRMEGSVDLLQTRFARILAEYESMQQKLKQRLTKVEKFLKPLIDTEFSSIEGPGAESGPIDST

>sp|P29973|CNGA1_HUMAN cGMP-gated cation channel alpha-1 OS=Homo sapiens GN=CNGA1 PE=1 SV=3
MKLSMKNNIINTQQSFVTMPNVIVPDIEKEIRRMENGACSSFSEDDDSASTSEESENENPHARGSFSYKSLRKGGPSQREQYLPGAIALFNVNNSSNKDQEPEEKKKKKKEKKSKSDDKNENKNDPEKKKKKKDKEKKKKEEKSKDKKEEEKKEVVVIDPSGNTYYNWLFCITLPVMYNWTMVIARACFDELQSDYLEYWLILDYVSDIVYLIDMFVRTRTGYLEQGLLVKEELKLINKYKSNLQFKLDVLSLIPTDLLYFKLGWNYPEIRLNRLLRFSRMFEFFQRTETRTNYPNIFRISNLVMYIVIIIHWNACVFYSISKAIGFGNDTWVYPDINDPEFGRLARKYVYSLYWSTLTLTTIGETPPPVRDSEYVFVVVDFLIGVLIFATIVGNIGSMISNMNAARAEFQARIDAIKQYMHFRNVSKDMEKRVIKWFDYLWTNKKTVDEKEVLKYLPDKLRAEIAINVHLDTLKKVRIFADCEAGLLVELVLKLQPQVYSPGDYICKKGDIGREMYIIKEGKLAVVADDGVTQFVVLSDGSYFGEISILNIKGSKAGNRRTANIKSIGYSDLFCLSKDDLMEALTEYPDAKTMLEEKGKQILMKDGLLDLNIANAGSDPKDLEEKVTRMEGSVDLLQTRFARILAEYESMQQKLKQRLTKVEKFLKPLIDTEFSSIEGPGAESGPIDST

Exercise (Find the structural domains)
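The simplest form of annotation transfer is the case shown above, where the query turns out to be identical to an annotated database entry. The sketch below parses FASTA text and reports database entries that contain the query (or are contained in it). It is an exact-match lookup only, not a homology search; a real workflow would use a sequence similarity search tool. The function names are hypothetical, the sequences are truncated to the first 30 residues for brevity, and the second database entry is invented.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def accession(header):
    """Extract ACCESSION from a 'db|ACCESSION|NAME ...' style header, else None."""
    parts = header.split("|")
    return parts[1] if len(parts) >= 3 else None


def exact_matches(query, database):
    """Headers of non-empty records that contain the query or are contained in it."""
    return [header for header, seq in database
            if seq and (query in seq or seq in query)]


# Truncated toy data: only the first 30 residues of the slide's sequence.
QUERY_FASTA = """>protein
MKLSMKNNIINTQQSFVTMPNVIVPDIEKE
"""

# The second entry is a made-up record used as a negative control.
DATABASE_FASTA = """>sp|P29973|CNGA1_HUMAN cGMP-gated cation channel alpha-1 (truncated)
MKLSMKNNIINTQQSFVTMP
NVIVPDIEKE
>toy|T00001|TOY_PROT hypothetical unrelated entry
MSTNPKPQRKTKRNTNRRPQ
"""

query = parse_fasta(QUERY_FASTA)[0][1]
database = parse_fasta(DATABASE_FASTA)
hits = exact_matches(query, database)
hit_accessions = [accession(h) for h in hits]
print(hit_accessions)
```

An exact hit gives the annotation directly; the harder and more common case, where no identical entry exists, is what the domain and family methods in the rest of the slides address.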

2DHH_A

File -> Open 2DHH_A.pdb
Actions -> Ribbon -> show
Presets -> Interactive 1 (ribbons)
Presets -> Publication 1 (silhouette, rounded ribbons)

[Figure labels: 3ABZ-truncated, PA14]

Domain classification

Class -> Fold -> Superfamily -> Family
Class -> Architecture -> Topology -> Superfamily -> Family

homology

http://www.cathdb.info

Protein family databases

Building families

1. Choose target
2. Build MSA of representative members
3. Build family profile
4. Use model to search sequence space for more members
5. Annotate

MANUAL CURATION

Profile Hidden Markov Models - Encapsulate diversity

seq1  ACG-LD
seq2  SCG--E
seq3  NCGgFD
seq4  TCG-WQ
      123-45

Match states: M1 (N, T, A, S), M2 (C), M3 (G), M4 (W, F, L, Y), M5 (D, E, Q)
Insert states: I0, I1, I2, I3, I4, I5
Delete states: D1, D2, D3, D4, D5
Begin and end states: B, E

Plan7 core model

HMMER3 (http://hmmer.janelia.org/)

Profile-HMM slides courtesy of Rob Finn (EMBL-EBI)
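The build-up of match, insert and delete states on these slides can be sketched in a few lines. The code below takes the four-sequence alignment from the slides, decides which columns become match states, counts observed residues per match state, and labels each sequence with its state path. The 50% gap threshold and the function names are assumptions for illustration. Real profile-HMM software such as HMMER3 also applies sequence weighting and prior information, so residues never observed in a column still receive non-zero probability; that step is omitted here.

```python
from collections import Counter

# Alignment as shown on the slides; the lowercase g marks an inserted residue.
ALIGNMENT = {
    "seq1": "ACG-LD",
    "seq2": "SCG--E",
    "seq3": "NCGgFD",
    "seq4": "TCG-WQ",
}


def match_columns(alignment, max_gap_fraction=0.5):
    """Indices of columns treated as match states (assumed rule: at most half gaps)."""
    seqs = list(alignment.values())
    columns = []
    for index, col in enumerate(zip(*seqs)):
        gaps = sum(1 for c in col if c == "-")
        if gaps / len(seqs) <= max_gap_fraction:
            columns.append(index)
    return columns


def emissions(alignment, columns):
    """Observed residue frequencies per match state (no pseudocounts or priors)."""
    seqs = list(alignment.values())
    table = []
    for index in columns:
        counts = Counter(s[index].upper() for s in seqs if s[index] != "-")
        total = sum(counts.values())
        table.append({aa: n / total for aa, n in counts.items()})
    return table


def state_path(sequence, columns):
    """Label one aligned sequence with match (M), insert (I) and delete (D) states."""
    match_set = set(columns)
    path = []
    k = 0  # number of match columns passed so far
    for index, residue in enumerate(sequence):
        if index in match_set:
            k += 1
            path.append(f"M{k}" if residue != "-" else f"D{k}")
        elif residue != "-":
            path.append(f"I{k}")  # residue in a non-match column
    return path


columns = match_columns(ALIGNMENT)
profile = emissions(ALIGNMENT, columns)
paths = {name: state_path(seq, columns) for name, seq in ALIGNMENT.items()}

for k, dist in enumerate(profile, start=1):
    print(f"M{k}", dist)
for name, path in paths.items():
    print(name, " ".join(path))
```

On this alignment the fourth column becomes an insert column, so seq3 passes through I3 and seq2 uses D4, matching the state labels on the slides.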

[Figure: profile HMM state diagram (states M1-M5, I0-I5, D1-D5, B, E)]

[Figure: diagram with points labelled A-H and P]

Figure 6. Benchmark of search sensitivity and specificity.
Eddy SR (2011) Accelerated Profile HMM Searches. PLoS Comput Biol 7(10): e1002195. doi:10.1371/journal.pcbi.1002195
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002195

Protein families

- Identify functional residues
- Capture diversity -> detect remote homology
- Identify areas of the sequence space still devoid of any functional characterisation

[Figure labels: A, D, F; B, C, E]

We can select interesting targets for experiments

http://openi.nlm.nih.gov/detailedresult.php?img=2954196_f-66-01137-fig2&req=4

Structural Genomics

Structural domains

~1000 domains

~4400 families

Functions and organisms

Signalling, extracellular and chromatin-associated proteins

Prokaryotes

Enzymes

~14000 families

>7000 families, >50000 subfamilies

All proteins

~2000 families

Full length

Domains

Integration

CDD
Uses RPS-BLAST

Integration

Structural domains
Functional annotation of families/domains
Protein features (sites)

Hidden Markov Models
Fingerprints
Profiles
Patterns

HAMAP

Member databases

Slide courtesy of Alex Mitchell (EMBL-EBI)

Orthologous families, trees

Pros:
- Better prediction of protein function (in principle, ortholog conjecture)
- Gene history
- Species trees

Caveats:
- Lateral gene transfer difficult to model/recognise -> bacteria difficult
- Gene loss difficult to account for, may lead to wrong ortho-para assignment
- Large families difficult to model