Bioinformatics Tools - Dr Jaime Prilusky - 2006
Bioinformatics Tools for Structural Biology
Dr Jaime Prilusky, ISPC-WIS, Vienna, June 2006
[Flowchart of the gene-to-structure pipeline. Boxes: Choose target; Clone gene; Modify gene; Expression; Other expression systems; Soluble; Refold; Purify; Evaluate purity and function; Crystallization; NMR; Structure Determination; Abandon. Branches are marked + and -.]
besides many intelligent approaches …
… targeted mutagenesis of surface patches containing residues with large flexible side chains and their replacement with smaller amino acids lead to effective preparation of X-ray quality crystals of proteins otherwise recalcitrant to crystallization …
Derewenda ZS, Rational protein crystallization by mutational surface engineering. Structure. 2004 Apr;12(4):529-35.
the overall gene2structure process still is …
[Flowchart of the gene-to-structure pipeline. Boxes: Choose target; Clone gene; Modify gene; Expression; Other expression systems; Soluble; Refold; Purify; Evaluate purity and function; Crystallization; NMR; Structure Determination; Abandon. Branches are marked + and -.]
http://www.weizmann.ac.il/ISPC/biotools.html
• Target Identity and Foldability
• Selection of Expression System
• Selection of Crystallization Conditions
• Data Awareness
• Data Management
Target Identity and Foldability
OCA
OCA© facilitates the understanding of genomic and proteomic data through data analysis and synthesis. Its data integration process brings together dispersed pieces of information and generates comprehensive summaries that might be just what a researcher needs for crystallizing an idea or understanding a problem.
http://bip.weizmann.ac.il/oca
Web Services Enabled
SeqFacts©
SeqFacts© is a tool for sequence identification, analysis, characterization and annotation. This server will try to find relevant information related to your sequence.
http://bip.weizmann.ac.il/sqfbin/seqfacts
[Diagram of the SeqFacts workflow. Labels: sequence (Amino Acid, DNA); quality cleaning; vectors removal; ORF; calculation; similarity search; specific DB search; properties; references; DB similarity; FoldIndex; RecentReferences; SeqAlert (once); SeqFacts.]
Automatic analysis of 1226_int11f-SP6-19 (Sun Jun 8 15:55:24 2003)
-------------------------------------------------------------------------------
Summary:
This analysis is based on a clean portion (4 to 644) from the 920 original bases. The best ORF of the six frames translation has a length of 186. A similarity search found 50 hits on 18 organisms: Oryctolagus cuniculus and Homo sapiens are the most representatives. Conserved domains present in the translated sequence: Arylesterase, COG3386.

Original files: 1226_int11f-SP6-19.fasta, 1226_int11f-SP6-19.gcg
ORFs Six Frames translation
Similarity Search
Conserved Domain database
Genome Drosophila
Genome Homo
Genome Mus
Genome Rattus
Fantom (Functional Annotation Of Mouse)
Taxonomy
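The "best ORF of the six frames translation" in the sample report can be illustrated with a short Python sketch. This is not SeqFacts' implementation: the ORF definition used here (the longest stretch from a methionine to the next stop codon or the end of the read), the function names and the frame labels are assumptions made for illustration. Only the standard genetic code is taken as given.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def best_orf(dna):
    """Return (protein, frame) for the longest M-initiated ORF in six frames.

    Frames are labelled +1..+3 (forward) and -1..-3 (reverse complement).
    Assumed definition: an ORF runs from M to the next stop or the read end.
    """
    dna = dna.upper()
    best = ("", None)
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(seq[offset:])
            for match in re.finditer(r"M[^*]*", protein):
                if len(match.group()) > len(best[0]):
                    best = (match.group(), f"{strand}{offset + 1}")
    return best


if __name__ == "__main__":
    print(best_orf("ATGGCCAAAGGGTTTTAA"))
```

A real pipeline would run this only on the cleaned portion of the read, after quality trimming and vector removal.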
FoldIndex
FoldIndex© tries to answer the question: Will this protein fold?
FoldIndex© predicts whether a given protein is intrinsically disordered.
Prilusky J., Felder C.E., Zeev-Ben-Mordehai T., Rydberg E., Man O., Beckmann J.S., Silman I. and Sussman J.L. Bioinformatics, 2005 Aug 15;21(16):3435-8
http://bip.weizmann.ac.il/fold
Web Services Enabled
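As a rough illustration of the kind of calculation behind this question, the sketch below scores a whole sequence with the charge-hydropathy relation described in the FoldIndex paper cited on this slide, I = 2.785<H> - |<R>| - 1.151, where <H> is the mean Kyte-Doolittle hydrophobicity rescaled to the range 0 to 1 and <R> is the mean net charge. The slides do not give the server's settings, so the function name, the whole-sequence averaging (rather than sliding windows) and the charge assignment (K, R = +1; D, E = -1) are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Assumed charges at neutral pH; all other residues count as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def fold_index(sequence):
    """Whole-sequence score: positive suggests folded, negative unfolded."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    # Rescale hydropathy from [-4.5, 4.5] to [0, 1] before averaging.
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)
    mean_r = sum(CHARGE.get(aa, 0) for aa in seq) / len(seq)
    return 2.785 * mean_h - abs(mean_r) - 1.151


if __name__ == "__main__":
    print(fold_index("LLLLLLLLLL") > 0)  # hydrophobic, uncharged
    print(fold_index("EEEEEEEEEE") < 0)  # highly charged, hydrophilic
```

For real predictions the server remains the reference; this sketch only shows how hydrophobicity and net charge pull the score in opposite directions.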
T. Zeev-Ben-Mordehai, E.H. Rydberg, A. Solomon, L. Toker, V.J. Auld, I. Silman, S. Botti and J.L. Sussman: The Intracellular Domain of the Drosophila Cholinesterase-Like Neural Adhesion Protein, Gliotactin, is Natively Unfolded. PROTEINS: Structure, Function, and Genetics 53:758–767 (2003)
Web Services access to FoldIndex:
http://bip.weizmann.ac.il/fldbin/findex?sq=SEQ&m=xml
sq is the protein sequence in one-letter code, with no spaces.
m (mode) is either xml or efam and selects the format in which FoldIndex returns the result. With the mode set to ‘xml’, the server sends results in XML format. With the mode set to ‘efam’, the results are sent in eFamily format.
The FoldIndex home page has sample Perl scripts to retrieve and parse prediction data.
Web Services Enabled
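The request format described on this slide can be sketched in Python as follows. The endpoint and the sq and m parameters come from the slide. The function names, the whitespace stripping and the timeout are illustrative assumptions, and the server may no longer be reachable at this address. The response is returned as raw text because its schema is not described here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

FOLDINDEX_ENDPOINT = "http://bip.weizmann.ac.il/fldbin/findex"
MODES = ("xml", "efam")


def foldindex_url(sequence, mode="xml"):
    """Build a FoldIndex request URL for a one-letter protein sequence."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    # The service expects the sequence without spaces or line breaks.
    sq = "".join(sequence.split())
    if not sq:
        raise ValueError("empty sequence")
    return f"{FOLDINDEX_ENDPOINT}?{urlencode({'sq': sq, 'm': mode})}"


def fetch_prediction(sequence, mode="xml", timeout=30):
    """Fetch the raw response text (hypothetical helper; needs network)."""
    with urlopen(foldindex_url(sequence, mode), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(foldindex_url("MKV LAA", "xml"))
```

Building the URL separately from fetching makes bulk access easy to script: a list of sequences can be turned into URLs first and retrieved at whatever rate the server tolerates.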
BestPrimers
BestPrimers© provides a simple interface for primer calculation, with FoldIndex© support.
http://bip.weizmann.ac.il/sqfbin/bestPrimers
Web Services Enabled
MultiPrimers
MultiPrimers© provides a simple interface for calculating multiple primers, with support for domains, enzymes and universal primers.
VerifyCloning©
VerifyCloning© compares your original sequence against the result from cloning procedures, highlighting conflicting areas. VerifyCloning has an automatic reverse/complement mode that will reorder your sequences if required.
http://bip.weizmann.ac.il/vfclon
Web Services Enabled
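The comparison and the automatic reverse/complement mode can be pictured with the minimal Python sketch below. It is a simplification, not VerifyCloning's algorithm: it assumes the two sequences are already aligned position by position (no insertions or deletions), and the function names and return format are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def mismatch_positions(reference, candidate):
    """0-based positions where the sequences differ, position by position."""
    shared = min(len(reference), len(candidate))
    diffs = [i for i in range(shared) if reference[i] != candidate[i]]
    # Any overhang beyond the shorter sequence also counts as conflicting.
    diffs.extend(range(shared, max(len(reference), len(candidate))))
    return diffs


def verify_cloning(original, result):
    """Pick the orientation of `result` that best matches `original`.

    Returns (orientation, conflicting_positions), where orientation is
    'forward' or 'reverse_complement'. Ties favour 'forward'.
    """
    original, result = original.upper(), result.upper()
    orientations = {
        "forward": result,
        "reverse_complement": reverse_complement(result),
    }
    best = min(
        orientations,
        key=lambda name: len(mismatch_positions(original, orientations[name])),
    )
    return best, mismatch_positions(original, orientations[best])


if __name__ == "__main__":
    # The sequencing read is on the opposite strand and carries one change.
    print(verify_cloning("ATGGCCAAAG", "CTTTAGCCAT"))
```

Handling insertions and deletions would require a proper pairwise alignment in place of the position-by-position comparison.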
Selection of Expression System
SuggestES©
SuggestES© takes the protein sequence you provide and scans a large database of protein sequences with known results for different expression systems, generating a suggestion based on several parameters:
• Similarity: how similar is your sequence to the existing data in the database?
• Recentness: how recently was a given expression system used?
• Frequency: how frequently was a given expression system used?
http://bip.weizmann.ac.il/suggestES
Web Services Enabled
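One way to picture how similarity, recentness and frequency could be combined into a single suggestion is sketched below. This is a toy ranking, not SuggestES itself: the record layout (system, similarity, year), the exponential recency weighting, the half-life default and the function name are all hypothetical.

```python
from collections import defaultdict


def rank_systems(hits, current_year, half_life_years=5.0):
    """Rank expression systems from database hits.

    hits: iterable of (system_name, similarity in 0..1, year_used).
    Each hit contributes similarity * recency weight; summing over hits
    makes frequently used systems score higher.
    """
    scores = defaultdict(float)
    for system, similarity, year in hits:
        age = max(0, current_year - year)
        recentness = 0.5 ** (age / half_life_years)  # halves every half-life
        scores[system] += similarity * recentness
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    example_hits = [
        ("E. coli", 0.90, 2005),
        ("E. coli", 0.80, 2004),
        ("Pichia", 0.95, 1996),
    ]
    for system, score in rank_systems(example_hits, current_year=2006):
        print(f"{system}: {score:.3f}")
```

In this made-up example the two recent, reasonably similar E. coli hits outrank a single older hit with higher similarity, which is the trade-off the three parameters are meant to capture.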
Selection of Crystallization Conditions
SuggestXC©
SuggestXC© takes the protein sequence you provide and scans a large database of protein sequences with known results for different crystallization conditions, generating a suggestion based on several parameters:
• Similarity: how similar is your sequence to the existing data in the database?
• Recentness: how recently was a given crystallization condition used?
• Frequency: how frequently was a given crystallization condition used?
http://bip.weizmann.ac.il/suggestXC
Web Services Enabled
Data Awareness
SeqAlert
SeqAlert© is a sequence alerting service that periodically compares your sequence(s) against sequences from 3D structures already determined or being determined at the PDB, and against TargetDB, the database of target sequences from worldwide structural genomics projects.
It also reports the PubMed IDs of papers that might be related to your sequence and were published in the last 20 days.
http://bip.weizmann.ac.il/seqalert
Data Management
Proteomics LIMS
An integrated Laboratory Information Management System for proteomics, with support for complexes.
Web Services Enabled
http://www.weizmann.ac.il/ISPC/
LabTargets (browser/database for proteomics)
supported operating systems
required software
client-server architecture
generates/exports
• SPINE xml
• TargetDB xml
• Targets Status Graph
• Targets Detail
• Structures Gallery
Some of our tools:
• OCA (browser/database for structure & function)
• SeqFacts (tell me something about this sequence)
• FoldIndex (will this protein fold?)
• BestPrimers (just best primers)
• MultiPrimers (domain-aware)
• VerifyCloning (check results from DNA sequencing)
• SuggestES (suggest an expression system)
• SuggestXC (suggest crystallization conditions)
• RecentReferences (what have people published?)
• SeqAlert (is someone else working on this sequence?)
• LIMS (including support for complexes)
• LabTargets (browser/database for proteomics)
How can you access our tools?
• Direct Web Access
• Web Services, Bulk Access
• Web Services, Software Integration
Direct Web Access
OCA mirrors world distribution
http://bip.weizmann.ac.il/oca
Web Services, Bulk Access
Beer Sheva, Israel; Cambridge, USA; Chicago, USA; Hawaii, USA; Helsinki, Finland
Leipzig, Germany; Missouri, USA; Montpellier, France; San Diego, USA
Bangalore, India; Kansas City, USA; Madrid, Spain
FoldIndex
LPC CSU
OCA
Web Services, Software Integration
• IBS: Integrated Bioinformatics System @ KISTI, Daejeon, South Korea [FoldIndex, BestPrimers, SeqFacts, SuggestES]
• GUTSS: Genome Structure Selection System @ The Burnham Institute, La Jolla CA, USA [SeqFacts]
• Raptor 3D: functional annotations to structures @ GBF, Braunschweig, Germany [OCA]
• WIS LIMS @ Weizmann Institute, Rehovot, Israel [OCA, FoldIndex, SeqFacts, BestPrimers, …]
Some of our tools allow for a local installation
Why have a local installation?
• intranet requirements
• repetitive batch analysis
• tight services integration
Software and languages used …
• Apache, the Apache Software Foundation.
• Bundle::DataMint, © 2003, Jaime Prilusky, WIS, modules for Data Mining and Data Integration.
• mysql, © 1995-2003 MySQL AB.
• flip, © 1997 OGMP (Organelle Genome Megasequencing Project)
• OCA, © 2000-2005, Jaime Prilusky, WIS
• Perl, © 1987-2001, Larry Wall.
• phred, © 1993-2002 by Phil Green and Brent Ewing.
• primer3, © 1996,1997,1998 Whitehead Institute for Biomedical Research.
• JMOL, Java molecular viewer for three-dimensional chemical structures.
Israel Structural Proteomics Center
Dr. Shira Albeck, Anna Branzburg, Rani Bravdo, Prof. Yigal Burstein, Dr. Orly Dym, Yossi Jacobovitch, Nurit Levy, Ran Meged,
Yigal Michael, Dr. Yoav Peleg, Dr. Jaime Prilusky, Prof. Gideon Schreiber, Prof. Israel Silman, Prof. Joel L Sussman, Dr. Tamar Unger