Page 1: PISA Protein Interfaces, Surfaces and Assemblies  Eugene Krissinel keb@ebi.ac.uk CCP4 & EBI-MSD

PISA
Protein Interfaces, Surfaces and Assemblies

http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html

Eugene Krissinel
keb@ebi.ac.uk

CCP4 & EBI-MSD

Page 2

PISA is a tool for the assessment of macromolecular interactions using data provided by protein crystallography.

Scope of tasks addressed by PISA:

• identification and prediction of multimeric states
• analysis of structure-function relationships
• analysis and prediction of macromolecular interactions
• analysis of macromolecular complexation and crystallisation
• properties of macromolecular interfaces
• search for interface/structure/assembly homologues
• active site recognition and analysis
• macromolecular surface analysis
• other

Project started in 2004, supported by BBSRC research grant 721/B19544

Page 3

PISA today

Web service hosted by EBI-MSD at http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html

• provides PISA analysis for all PDB entries and database searches
• allows upload of PDB and mmCIF files for interactive PISA analysis
• provides XML download of multimer data, which is used in server applications for molecular replacement (BALBES)
• works on amino acid, nucleic acid and ligand structures
• more than 140,000 external queries served since the release
• more than 1700 users
• has a command-prompt stand-alone version

Page 4

PISA basics

PISA is based on chemical thermodynamics:

$\Delta G_{diss} = -\Delta G_{int} - T\Delta S > 0$

for stable structures in the standard state.

$\Delta G_{diss}$ cannot be calculated exactly. PISA uses semiempirical models with parameters calibrated to available experimental data on multimeric states.

Precision of free energy estimates in PISA: ±5 kcal/mol

Success rate of PQS prediction: 80-90%
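As an illustration of the criterion on this slide, read as ΔG_diss = −ΔG_int − TΔS > 0 for a stable assembly, the following minimal sketch evaluates the relation and flags results that fall inside the quoted ±5 kcal/mol precision. The function names, the three-way classification and the input numbers are assumptions made for illustration; they are not PISA's actual decision rule or data.

```python
def dissociation_free_energy(dg_int, t_ds):
    """Return dG_diss = -dG_int - T*dS (all values in kcal/mol).

    dg_int: free energy of interface formation (negative when binding is favourable)
    t_ds:   entropy term T*dS gained on dissociation
    """
    return -dg_int - t_ds


def classify(dg_diss, margin=5.0):
    """Classify an assembly, treating |dG_diss| <= margin as undecided.

    The margin reuses the +/-5 kcal/mol precision quoted on the slide;
    using it as a 'grey zone' is an assumption of this sketch.
    """
    if dg_diss > margin:
        return "stable"
    if dg_diss < -margin:
        return "unstable"
    return "uncertain"


# Hypothetical numbers, for illustration only.
strong = dissociation_free_energy(dg_int=-30.0, t_ds=12.0)
weak = dissociation_free_energy(dg_int=-10.0, t_ds=12.0)
print(strong, classify(strong))
print(weak, classify(weak))
```

The grey zone makes explicit why estimates close to zero should be treated with caution: within the stated precision the sign of ΔG_diss is not reliably known.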

Page 5

Last year activity – nucleic acids and ligands

Extension to include protein-DNA/RNA and ligand interactions

• Derivation and calibration of interaction parameters
• Database of ligand interactions (~6000 entries parameterized on atomic level)
• Tools for database update and semi-automatic calculation of protein-ligand interactions

Core algorithm completely rewritten in order to:

• implement changes needed to accommodate protein-DNA/RNA and ligand interactions
• optimize and speed up the calculations

Page 6

Last year activity – ligand control

Control over ligand processing:

• Possibility to exclude certain ligands from processing
• Choice of ligand processing modes:
  • Automatic
  • Fix all ligands
  • Free all ligands

Page 7

Last year activity – adaptation for MSD&PDB

Interface and presentation improvements at the request of PDB/MSD curation teams:

• Consistent identification of symops in PISA pages
• Adoption of PDB@RCSB symop nomenclature
• Automatic generation of REMARK 350
• Optimization of final assembly positions
• Reporting on redundant assemblies (especially when ASU contains a fractional number of assemblies >1)

PISA is now employed by both MSD and PDB@RCSB as a mandatory processing tool for all depositions
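For readers unfamiliar with REMARK 350, it is the PDB-format record that lists the symmetry operations (BIOMT rows) needed to build a biological assembly. The sketch below formats the three BIOMT rows for one operation. The helper name `biomt_lines` is hypothetical, the column layout follows the one commonly seen in PDB files and should be checked against the wwPDB format specification, and nothing here is taken from PISA's own code.

```python
def biomt_lines(op_index, rotation, translation):
    """Format one symmetry operation as three REMARK 350 BIOMT rows.

    op_index:    serial number of the operation (1-based)
    rotation:    3x3 rotation matrix as nested lists
    translation: translation vector of length 3, in angstroms
    """
    lines = []
    for row in range(3):
        r = rotation[row]
        lines.append(
            "REMARK 350   BIOMT%d %3d%10.6f%10.6f%10.6f     %10.5f"
            % (row + 1, op_index, r[0], r[1], r[2], translation[row])
        )
    return lines


# Identity operation: the assembly is the deposited coordinates themselves.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for line in biomt_lines(1, identity, [0.0, 0.0, 0.0]):
    print(line)
```

A complete REMARK 350 block also carries header lines (biomolecule number, chains the operations apply to), which this sketch leaves out.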

Page 8

Last year activity - PISA database

PISA database searches by:

• Multimeric state
• Symmetry number
• Space group
• Homomeric type
• Salt bridges
• Disulphide bonds
• List of ligands
• List of keywords
• Dissociation energy
• Assembly ASA
• Assembly BSA
• Percent BSA

Page 9

Last year activity - standalone PISA

Command-prompt, stand-alone PISA for inclusion into CCP4

• Contains only data-processing part of “big” PISA, i.e. no database

• For technical reasons, there are code differences from “big” PISA

• Functionally identical to the corresponding parts of “big” PISA

• Mimics web-page output of “big” PISA in plain text

• Provides same XML output as “big” PISA

• Works as a local server:
  • Maintains sessions
  • Data processing separated from data retrieval

• Visualization using Rasmol

Page 10

Standalone PISA example

Page 11

Last year activity – tune-up and polishing

The last percent of improvement takes 99% of the effort

Literally hundreds of small problems solved on an everyday basis. Examples:

• Inference of correct orthogonalisation codes
• Choice of margins for identification of parallel monomeric units
• Symmetry number calculations: superposition margins and unit enumeration order
• Identification and proper treatment of overlapping symmetry mates with fractional occupancy
• Unique labels for the download/visualisation data and wait pages to avoid caching on remote servers
• Catching up with EBI systems updates
• PISA is roughly 60,000 C++ statements, and small bugs are most probably still there

6 releases over the year

Page 12

Future plans

PISA is systematically underperforming on FABs. Possible reasons:

• Neglect of electrostatic interactions
• Neglect of entropy absorbance in flexible complexes

Both are very difficult problems to address

The last percent of improvement takes 99% of the effort

Page 13

Future plans

Analysis of “custom” assemblies
• Allow for input without crystallographic data
• Effectively, inclusion of NMR entries as well

Assessment of crystal “quality”
• Identification of fake PDB entries and depositions

Detection of “custom” assemblies
• Allow for report on specific assemblies otherwise missed as unstable

Automatic prediction of macromolecular interactions and assemblies by homologue search in the PISA database

Page 14

Fake PDBs

                          2i07   2icc   2ice   2icf   2hr0
BSA                       20%    51%    24%    24%    10%
Interfaces per chain      9.56   7      6.12   8.33   3.5
Min. interfaces / chain   8      7      4      4      2
Connected crystal         Yes    Yes    Yes    Yes    Yes
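The "Interfaces per chain" and "Min. interfaces / chain" rows can be read as the mean and the minimum number of interfaces that each chain in the crystal takes part in. The sketch below computes both from a list of pairwise interfaces. The function name, the input format and the choice to count an interface between a chain and its own symmetry copy only once are assumptions of this example; how PISA counts them is not stated on the slide, and the data are made up.

```python
from collections import Counter


def interfaces_per_chain(chains, interfaces):
    """Return (mean, minimum) number of interfaces per chain.

    chains:     list of chain identifiers in the asymmetric unit
    interfaces: list of (chain_a, chain_b) pairs, one pair per interface;
                chain_b may be a symmetry copy carrying the same identifier
    """
    counts = Counter({chain: 0 for chain in chains})
    for a, b in interfaces:
        counts[a] += 1
        if b != a:  # assumption: a self-interface counts once for its chain
            counts[b] += 1
    values = [counts[chain] for chain in chains]
    return sum(values) / len(values), min(values)


# Hypothetical two-chain entry with three interfaces.
mean_count, min_count = interfaces_per_chain(
    ["A", "B"], [("A", "B"), ("A", "A"), ("A", "B")]
)
print(mean_count, min_count)
```

In a well-packed crystal every chain makes several contacts, so an unusually low minimum is the kind of signal the table is looking at.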