PISA: Protein Interfaces, Surfaces and Assemblies
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
Eugene [email protected]
CCP4 & EBI-MSD
PISA is a tool for the assessment of macromolecular interactions using data provided by protein crystallography.
Scope of tasks addressed by PISA:
• identification and prediction of multimeric states
• analysis of structure-function relationship
• analysis and prediction of macromolecular interactions
• analysis of macromolecular complexation and crystallisation
• properties of macromolecular interfaces
• search for interface/structure/assembly homologues
• active site recognition and analysis
• macromolecular surface analysis
• other
Project started in 2004, supported by BBSRC research grant 721/B19544
PISA today
Web service hosted by EBI-MSD at http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
• provides PISA analysis for all PDB entries and database searches
• allows upload of PDB and mmCIF files for interactive PISA analysis
• provides XML download of multimer data, which is used in server applications (BALBES) for molecular replacement
• works on amino acid, nucleic acid and ligand structures
• more than 140,000 external queries served since the release
• more than 1700 users
• has a command-prompt stand-alone version
PISA basics
PISA is based on chemical thermodynamics:
$\Delta G_{\mathrm{diss}} = -\Delta G_{\mathrm{int}} - T\Delta S > 0$
for stable structures in the standard state.
$\Delta G_{\mathrm{diss}}$ cannot be calculated exactly. PISA uses semi-empirical models with parameters calibrated against available experimental data on multimeric states.
Precision of free energy estimates in PISA: ±5 kcal/mol
Success rate of PQS prediction: 80-90%
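As a worked illustration of the stability criterion, the sketch below assumes the relation on this slide reads ΔG_diss = −ΔG_int − TΔS > 0. The function names, the input numbers and the three-way classification using the quoted ±5 kcal/mol precision as a margin are assumptions made for illustration; they are not taken from PISA itself.

```python
# Quoted precision of PISA free-energy estimates, in kcal/mol (from the slide).
PRECISION_KCAL = 5.0


def dissociation_free_energy(dg_int: float, t_ds: float) -> float:
    """Return dG_diss = -dG_int - T*dS, all terms in kcal/mol.

    dg_int: free energy of the interface interactions (negative when favourable).
    t_ds:   entropy term T*dS of dissociation.
    """
    return -dg_int - t_ds


def classify(dg_diss: float, margin: float = PRECISION_KCAL) -> str:
    """Hypothetical three-way call: values within the margin are borderline."""
    if dg_diss > margin:
        return "stable"
    if dg_diss < -margin:
        return "unstable"
    return "borderline"


# Made-up inputs, purely illustrative.
dg = dissociation_free_energy(dg_int=-30.0, t_ds=12.0)
print(dg, classify(dg))
```

Treating the precision as a margin reflects the idea that an estimate within ±5 kcal/mol of zero cannot be confidently called either way.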
Last year activity – nucleic acids and ligands
Extension to include protein-DNA/RNA and ligand interactions
• Derivation and calibration of interaction parameters
• Database of ligand interactions (~6000 entries parameterized on atomic level)
• Tools for database update and semi-automatic calculation of protein-ligand interactions
Core algorithm completely rewritten in order to:
• implement changes needed to adopt protein-DNA/RNA and ligand interactions
• optimize and speed up the calculations
Last year activity – ligand control
Control over ligand processing:
• Possibility to exclude certain ligands from processing
• Choice of ligand processing modes:
  - Automatic
  - Fix all ligands
  - Free all ligands
Last year activity – adaptation for MSD&PDB
Interface and presentation improvements at request of PDB/MSD curation teams:
• Consistent identification of symops in PISA pages
• Adoption of PDB@RCSB symop nomenclature
• Automatic generation of REMARK 350
• Optimization of final assembly positions
• Reporting on redundant assemblies (especially when ASU contains a fractional number of assemblies >1)
PISA is now employed by both MSD and PDB@RCSB as a mandatory processing tool for all depositions
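To make the REMARK 350 item concrete, here is a minimal sketch that formats one symmetry operation as three BIOMT records. The column layout is an assumption based on the fixed-width form commonly seen in PDB files, and the function name `biomt_lines` is hypothetical; the slides do not show how PISA generates these records.

```python
from typing import List, Sequence


def biomt_lines(serial: int,
                rotation: Sequence[Sequence[float]],
                translation: Sequence[float]) -> List[str]:
    """Format one 3x3 rotation plus translation as three BIOMT lines.

    Assumed layout: 'REMARK 350', three spaces, 'BIOMTn', a 4-wide serial,
    three 10.6f rotation terms, five spaces, one 10.5f translation term.
    """
    lines = []
    for row in range(3):
        a, b, c = rotation[row]
        lines.append(
            f"REMARK 350   BIOMT{row + 1}{serial:4d}"
            f"{a:10.6f}{b:10.6f}{c:10.6f}     {translation[row]:10.5f}"
        )
    return lines


# Identity operation as the first (serial 1) transformation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for line in biomt_lines(1, identity, [0.0, 0.0, 0.0]):
    print(line)
```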
Last year activity - PISA database
PISA database searches by
• Multimeric state
• Symmetry number
• Space group
• Homomeric type
• Salt bridges
• Disulphide bonds
• List of ligands
• List of keywords
• Dissociation energy
• Assembly ASA
• Assembly BSA
• Percent BSA
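The search criteria above can be thought of as filters combined over assembly records. The sketch below shows that idea only; the `AssemblyRecord` fields, the `search` helper and the demo entries are hypothetical and do not reflect the actual PISA database schema or query interface.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class AssemblyRecord:
    """Hypothetical record holding a few of the searchable properties."""
    entry: str
    multimeric_state: int
    space_group: str
    dissociation_energy: float  # kcal/mol
    ligands: Tuple[str, ...] = ()


Predicate = Callable[[AssemblyRecord], bool]


def search(records: Iterable[AssemblyRecord],
           *predicates: Predicate) -> List[AssemblyRecord]:
    """Return the records that satisfy every supplied predicate."""
    return [r for r in records if all(p(r) for p in predicates)]


# Made-up demo data, not real PDB entries.
RECORDS = [
    AssemblyRecord("demo1", 2, "P 21 21 21", 12.5, ("HEM",)),
    AssemblyRecord("demo2", 4, "P 1 21 1", 30.1),
    AssemblyRecord("demo3", 2, "P 21 21 21", -3.0, ("ATP",)),
]

# Dimers with a positive dissociation energy.
hits = search(
    RECORDS,
    lambda r: r.multimeric_state == 2,
    lambda r: r.dissociation_energy > 0,
)
print([r.entry for r in hits])
```

Passing predicates as separate arguments keeps each criterion independent, so any subset of the listed search fields can be combined.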
Last year activity - standalone PISA
Command-prompt, stand-alone PISA for inclusion into CCP4
• Contains only data-processing part of “big” PISA, i.e. no database
• For technical reasons, there are code differences from “big” PISA
• Functionally identical to the corresponding parts of “big” PISA
• Mimics web-page output of “big” PISA in plain text
• Provides same XML output as “big” PISA
• Works as a local server:
  - Maintains sessions
  - Data processing separated from data retrieval
• Visualization using Rasmol
Standalone PISA example
Last year activity – tune-up and polishing
The last percent of improvement takes 99% of all the effort
Literally hundreds of small problems are solved on a daily basis. Examples:
• Inference of correct orthogonalisation codes
• Choice of margins for identification of parallel monomeric units
• Symmetry number calculations: superposition margins and unit enumeration order
• Identification and proper treatment of overlapping symmetry mates with fractional occupancy
• Unique labels for the download/visualisation data and wait pages to avoid caching on remote servers
• Catching up with EBI systems update
• PISA is roughly 60,000 C++ statements and small bugs are most probably still there
6 releases over the year
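One of the items above, unique labels to avoid caching on remote servers, corresponds to a common web technique: adding a fresh token to each URL so that intermediate caches treat it as new. The sketch below illustrates that general technique under stated assumptions; the parameter name `label`, the helper name and the example URL are hypothetical, and the slides do not describe how PISA implements it.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_unique_label(url: str, param: str = "label") -> str:
    """Return the URL with an extra query parameter carrying a random token."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))  # fresh 32-character hex token
    return urlunsplit(parts._replace(query=urlencode(query)))


# Two requests for the same page get distinct URLs.
print(with_unique_label("http://example.org/download?session=abc"))
print(with_unique_label("http://example.org/download?session=abc"))
```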
Future plans
PISA is systematically underperforming on FABs. Possible reasons:
• Neglect of electrostatic interactions
• Neglect of entropy absorbance in flexible complexes
The last percent of improvement takes 99% of the effort
Both are very difficult problems to address
Future plans
Analysis of “custom” assemblies
• Allow for input without crystallographic data
• Effectively, inclusion of NMR entries as well
Assessment of crystal “quality”
Detection of “custom” assemblies
• Allow for report on specific assemblies otherwise missed as unstable
Automatic prediction of macromolecular interactions and assemblies by homologue search in PISA database
• Identification of fake PDB entries and depositions
Fake PDBs
                          2i07   2icc   2ice   2icf   2hr0
BSA                       20%    51%    24%    24%    10%
Interfaces per chain      9.56   7      6.12   8.33   3.5
Min. interfaces / chain   8      7      4      4      2
Connected crystal         Yes    Yes    Yes    Yes    Yes
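The statistics for these five entries can be held in a small structure for comparison. The slides do not state which criterion flags an entry as suspicious, so the sketch below only stores the tabulated values and ranks the entries by BSA; the `CrystalStats` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CrystalStats:
    """Per-entry values copied from the table on this slide."""
    entry: str
    bsa_percent: float
    interfaces_per_chain: float
    min_interfaces_per_chain: int
    connected_crystal: bool


TABLE: List[CrystalStats] = [
    CrystalStats("2i07", 20.0, 9.56, 8, True),
    CrystalStats("2icc", 51.0, 7.0, 7, True),
    CrystalStats("2ice", 24.0, 6.12, 4, True),
    CrystalStats("2icf", 24.0, 8.33, 4, True),
    CrystalStats("2hr0", 10.0, 3.5, 2, True),
]

# Rank entries from lowest to highest buried surface area (stable sort).
by_bsa = sorted(TABLE, key=lambda s: s.bsa_percent)
for s in by_bsa:
    print(f"{s.entry}  BSA {s.bsa_percent:.0f}%  "
          f"interfaces/chain {s.interfaces_per_chain}")
```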