Upload
usd-bioinformatics
View
201
Download
0
Embed Size (px)
Citation preview
Case study (ref)
PSI meta-server used
Computer drive drug design process
Step Input Tools Output / result
Selection of ligand compoundsequence
PubChem structure
structure Open Babel required file format
ADMET studies sequence Pre-ADMET Provide drug-likeliness, ADME profile and toxicity analysis for the ligand.
Receptor characterization
sequence GOR IV Text data result
Select Template sequence BLAST Identity, similarity, expectation value & alignment scores
Homology modeling
protein sequences Modeller pdb file
Visualizing pdb file PyMol visualized model
Model validation pdb file
PROCHECK Parameters(covalent bond distances and angles, stereochemicalvalidation and atom nomenclature)
ERRAT The overall quality factor of non-bonded interactions between different atoms types
DaliLite mean square deviation (RMSD) between the set of targets and template protein to check deviation of modelled protein from the template protein structure.
Docking studies PDB, PDBQ, PDBQT, SYBYL mol2 or PQR format
AutoDock 4.2 PDBQT format and include special keywords establishing the torsional flexibility
Data Preparation and Analysis
The world wide Protein Data Bank: The single archive of experimental marcomolecular structural data. [RCSB PDB] (USA); [PDBe] (Europe); [PDBj] (Japan)CATH: A manually curated hierarchical domain classification of protein structures in the Protein Data Bank.
UniProt: Protein Knowledgebase. A comprehensive, high-quality and freely accessible database of protein sequence and functional information.RefSeq: NCBI Reference Sequence. A collection of curated, non-redundant genomic DNA, transcript (RNA), and protein sequences produced by NCBI.
SBKB: Structural Biology Knowledgebase. A portal to protein structures, sequences, functions and methods.
Experimental Data
Acquisition
Amino Acid Sequences
Structure Template libraries
Structure Modelling
Template Search/Fold Recognition
BLAST/PSI-BLAST: Local alignment search tools.
HHpred: Server for homology detection and structure prediction by HMM-HMM comparison.
Homology Modeling
New prediction
yes
no
HHpred: Server for homology detection and structure prediction by HMM-HMM comparison.
I-Tasser: I-TASSER is a server for protein structure and function predictions. 3D models are built based on multiple-threading alignments by LOMETS and iterative TASSER assembly simulations.
M4T: Comparative Modelling using a combination of multiple templates and iterative optimization of alternative alignments.
ModWeb: A web server for automated comparative modeling that relies on PSI-BLAST, IMPALA and MODELLER.
SWISS-MODEL: Fully automated protein structure homology-modeling server accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer).
Homology Modeling
Structure Modeling
Swiss-model
identification of
structural template
alignment of target sequence
& template structure
model building
model quality
evaluation
InterPro Domain Scan:InterPro, Pfam, TIGRFAMs, PROSITE, SUPERFAMILY, ProDom,PRINTS, SMART, PROSITEPsiPred Secondary Structure Prediction:PSIPREDDISOPRED Disorder Prediction:DISOPREDMEMSAT:MEMSAT
using BLAST query against the ExPDB template library extracted from PDB.When no suitable templates are identified, using Iterative Profile Blast. Which is the template library is searched with PSI-BLAST (Altschul et al.) using an iteratively generated sequence profile based on NR (Wheeler et al.).HHSearch: To detect distantly related template structures (Söding et al.) Display of template identification results
DeepView:http://www.expasy.org/spdbv/
Protein Structure & Model Assessment Tools: ANOLEAQMEAN, The global QMEAN4 scoring function ( Benkert et al. 2008), The global QMEAN6 scoring function (Benkert et al. 2008), The local version of the QMEAN scoring function (Benkert et al. 2009),DFIRE GROMOSWhat CheckPROCHECKPROMOTIFDSSP
QMEAN4 global scoresLocal Model Quality Estimation: Anolea / QMEAN / Gromos:Alignment Modelling Log Template Selection Log Quaternary Structure Modeling LogLigand Modeling Log
Outputs:
Structure Modelling
The "automated mode" is suited for cases where the target-template similarity is sufficiently high to allow for fully automated modeling.
This submission requires only the amino acid sequence or the UniProt accession code of the target protein as input data.
Depending on the planned model application, it can be necessary to select a different structural template than the one ranked first in the automated process. Please make sure that this file contains only a single protein chain, and does not contain chemically modified amino acids, hereto atoms, ligands, etc.
Automated Mode
If the three-dimensional structure is known for at least one of the members, this alignment can be used as starting point for comparative modelling using the "alignment mode".
The "alignment mode" allows the user to test several alternative alignments and evaluate the quality of the resulting models in order to achieve an optimal result.
1. Prepare a multiple sequence alignment.2. Submit your alignment to the Workspace Alignment Mode.3. Select Target and Template.4. Check Alignment and Submit.
The server pipeline will build the model purely based on this alignment. During the modeling process, implemented as rigid fragment assembly in the SWISS-MODEL (Schwede et al.) pipeline, the modeling engine might introduce minor heuristic modifications to the placement of insertions and deletions.
Alignment Mode
In difficult modeling situations, where the correct alignment between target and template cannot be clearly determined by sequence based methods, visual inspection and manual manipulation of the alignment can significantly help improving the quality of the resulting model.
Project files contain the superposed template structures, and the alignment between the target and template. Project files can be generated inside the program DeepView (Swiss-PdbViewer Guex et al.), by the workspace template selection tools, and are also the default output format of the modeling pipeline. This allows analyzing and iteratively improving the models generated by the "Automated mode" and "Alignment mode" modeling approaches.
Project Mode
M4T
Output: pdb filepir fileana fileEnergy profile
Structure Modelling
I-TASSER
tries retrieve template proteins
of similar folds from the PDB library
by LOMETS
Structure assembly continuous fragments
by replica-exchange Monte Carlo simulations
with the threading unaligned regions
(mainly loops) built by ab initio modeling
low free-energy states are identified by SPICKER
Structure Re-assembly
Retrieve from cluster LOMETS & TM-align
lowest energy
structures are
selected.
final full-atomic models
(Remo H-Bond optimization)
function predictions
TM-alignsearch
TM-score
Outputs:Predicted Secondary StructurePredicted Solvent Accessibilitypdb fileTop 10 templatesProteins with highly similar structure in PDB Function PredictionPredicted GO termsPredicted Binding Site
Structure Modellingsubmit an amino acid sequence
HHpred
Select input format
MSA Generation Method
More Options
Max. MSA Generation iterations
Score secondary structure
Realign with MAC algorithm
Alignment mode
Select HMM databases
Entering a single query sequenceEntering a multiple alignment
ProteomesPdb70,Scop70,CDD,InterPro,PfamA,SMART,
PANTHER, TIGRFAMs, PIRSF, SUPERFAMILY, CATH/Gene3D,
COG/KOG, PfamB
HBlits : Download pdf filePSI-Blast : View Article
E-value threshold for MSA Generation
Min. coverage of MSA hitsMin. sequence identity of MSA hits
with query MAC realignment threshold
(0.0:global, >=0.1:local) Compositional bias correction
Show sequences per HMM Width of alignments
Min. probability in hit listMax. number of hits in hit list
Output: pdb file
Structure Modelling
Modeler
Searching for structures related to TvLDH
Selecting a template
Model evaluation Model building
Aligning TvLDF with the template
target TvLDH sequence
profile.build()
automodel Output: pdb file
Structure Modeling
New predictionWhen no suitable template structure can be identified, de novo (a.k.a. ab initio) structure prediction methods can be used to generate three-dimensional protein models without relying on a homologus template structure:
Robetta: Full-chain protein structure prediction server based on the Rosetta method.Rosetta: De novo protein structure prediction software.
Structure Modeling
Hybrid techniquesThe goal of hybrid techniques is to contribute to a comprehensive structural characterization of biomolecules ranging in size and complexity from small peptides to large macromolecular assemblies. Detailed structural characterization of assemblies is generally impossible by any single existing experimental or computational method. This barrier can be overcome by hybrid approaches that integrate data from diverse biochemical and biophysical experiments:CS-ROSETTA: System for chemical shifts based protein structure prediction using ROSETTA.IMP: software for a comprehensive structural characterization of biomolecules.
Structure Modelling
Confidence EstimationValidation and Quality estimation
Example: Verify3D
Tools Input Output
Verify 3D pdb 3D-1D average score, raw data, raw average data,
Whatcheck
pdb Detial text report, TeX file
Prove pdb An pdf file with Z–score and analysis of residues, and an text file
Errat pdb An pdf file with overall quality factor, error value in each residue
PROCHECK
pdb comprise a number of plots, in PostScript format
MolProbity
pdb can view in pdf or KiNG, can choose lots of output
ProSA pdb Z-Score, knowledge based energy, sequence position
Confidence Estimation
Application
Structure Visualization & Analysis
PyMol: A Python based open-source viewer for visualization of macromolecular structures.
AutoDock: A suite of automated docking tools.
Molecular Interactions
Molecular Motions DynDom: Protein Domain Motion Analysis.molmovdb.org: Gallery of morphs.molmovdb.org: Molecular Movements Database.