Increasing the Value of Crystallographic Databases

• Derived knowledge bases

• Knowledge-based applications programs

• Data mining tools for protein-ligand complexes

Mogul

• Knowledge base of molecular geometry information taken from CSD

• Bond length, valence angle and torsion angle distributions

• Aim: click on a molecular parameter of interest and get observed distribution with no intervening steps

Mogul - Search Setup

The user loads a molecule, then specifies a bond length, bond angle or torsion angle of interest

Mogul - Results

Substructure

Mogul - Search Algorithm

Substructures stored in a hierarchical tree:

Fragment A-B-C-D, with successive tree levels keyed on:

• Properties of B and C

• Properties of the A-B and C-D bonds

• Properties of the atoms bound to B and C
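The hierarchical storage above can be sketched as a nested lookup. This is a minimal illustration, not Mogul's implementation: the class name `FragmentTree`, the string keys, and the three-level layout are all assumptions made for the example.

```python
from collections import defaultdict


class FragmentTree:
    """Toy three-level index for A-B-C-D fragments (hypothetical layout)."""

    def __init__(self):
        # level 1 (B, C) -> level 2 (A-B, C-D bonds) -> level 3 (neighbours)
        # -> list of observed parameter values
        self._root = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

    def add(self, central_key, bond_key, neighbour_key, value):
        """File one observed value under its three keys."""
        self._root[central_key][bond_key][neighbour_key].append(value)

    def lookup(self, central_key, bond_key=None, neighbour_key=None):
        """Return every value stored under the deepest key supplied."""
        level1 = self._root.get(central_key, {})
        if bond_key is None:
            return [v for l2 in level1.values() for l3 in l2.values() for v in l3]
        level2 = level1.get(bond_key, {})
        if neighbour_key is None:
            return [v for l3 in level2.values() for v in l3]
        return list(level2.get(neighbour_key, []))


tree = FragmentTree()
tree.add("C-C", "single/single", "H,H|H,H", 60.0)    # made-up torsion values
tree.add("C-C", "single/single", "H,C|H,H", 180.0)
print(tree.lookup("C-C"))                             # all values under B, C
print(tree.lookup("C-C", "single/single", "H,C|H,H"))  # fully specified key
```

Stopping the descent at a shallower level returns a broader set of observations, which is one plausible way such a tree could trade specificity for hit count.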

Mogul - Getting More Hits

Allow certain atoms to be more general

Generification rules

Mogul - Generic Search Results

Substructures sorted by 2D similarity with original query
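A rough sketch of the idea: relax some atom types, collect the extra hits, then rank them by similarity to the original query. The halogen rule in `GENERIC_CLASS`, the tuple-of-element-symbols fragment encoding, and the Tanimoto coefficient over element sets are all assumptions chosen for illustration; the slides do not say which rules or similarity measure Mogul uses.

```python
# Hypothetical generification rule: any halogen matches any other halogen.
GENERIC_CLASS = {"F": "Hal", "Cl": "Hal", "Br": "Hal", "I": "Hal"}


def generify(fragment):
    """Replace each atom label by its generic class, if it has one."""
    return tuple(GENERIC_CLASS.get(atom, atom) for atom in fragment)


def tanimoto(a, b):
    """Tanimoto coefficient on the sets of labels (a crude 2D similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def generic_search(query, library):
    """Return library fragments matching the generified query, best first."""
    target = generify(query)
    hits = [frag for frag in library if generify(frag) == target]
    return sorted(hits, key=lambda frag: tanimoto(query, frag), reverse=True)


query = ("C", "C", "C", "Cl")
library = [("C", "C", "C", "Br"), ("C", "C", "C", "Cl"), ("C", "C", "C", "O")]
print(generic_search(query, library))
```

The exact match sorts ahead of the bromo analogue, and the oxygen fragment is excluded because no rule generalises it.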

IsoStar and SuperStar

• IsoStar - knowledge base of information about intermolecular interactions

• SuperStar - program for predicting binding points in an enzyme active site

• SuperStar predictions based solely on IsoStar data

IsoStar Scatterplots

CSD vs. PDB scatterplots

[Histogram: similarity index distribution for 72 comparisons. x-axis: Carbo index (0.1 to 1); y-axis: number of density plots (0 to 30); series: any aromatic CH, any aliphatic CH, water, any C=O, any OH, any NH]
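For reference, the Carbo index between two density maps sampled on the same grid is the normalised overlap C = Σ a_i b_i / sqrt(Σ a_i² · Σ b_i²). The sketch below assumes each map is a flat list of grid densities; how the CSD and PDB maps were actually gridded and aligned is not stated in the slides.

```python
import math


def carbo_index(map_a, map_b):
    """Carbo similarity index between two equally sized density grids."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must be sampled on the same grid")
    overlap = sum(a * b for a, b in zip(map_a, map_b))
    norm = math.sqrt(sum(a * a for a in map_a) * sum(b * b for b in map_b))
    if norm == 0.0:
        raise ValueError("at least one map is empty")
    return overlap / norm


csd_map = [1.0, 2.0, 3.0]   # made-up densities
pdb_map = [2.0, 4.0, 6.0]   # same shape, different scale
print(carbo_index(csd_map, pdb_map))
```

Because the index is normalised, two maps with the same shape but different overall contact counts still score as identical, which suits a comparison between databases of different size.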

IsoStar Density Surfaces

Scaling of IsoStar Surfaces

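• Densities at grid point i are converted to propensities by: $p_i = d_i / d_{av}$

• The average density $d_{av}$ is the density of contacts expected by random chance. [Equation for $d_{av}$ garbled in extraction; surviving terms: $n_i^{contact}$, $n^{central}$, $V_i^{cell}$, summed over $i = 1, \dots, n$]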
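A small numeric sketch of this scaling, taking the propensity as density divided by average density. The form used for the average density here (total contacts over total grid volume, per central group) is an assumption, since the slide's own formula did not survive extraction; the function and argument names are likewise hypothetical.

```python
def propensities(contact_counts, cell_volumes, n_central):
    """Convert per-cell contact counts to propensities p_i = d_i / d_av.

    ASSUMPTION: d_i = n_i / (n_central * V_i), and d_av is the total number
    of contacts divided by (n_central * total volume).
    """
    if len(contact_counts) != len(cell_volumes):
        raise ValueError("one volume is needed per grid cell")
    densities = [
        count / (n_central * volume)
        for count, volume in zip(contact_counts, cell_volumes)
    ]
    d_av = sum(contact_counts) / (n_central * sum(cell_volumes))
    return [d / d_av for d in densities]


# Made-up example: three unit-volume cells, four central groups.
print(propensities([2, 0, 6], [1.0, 1.0, 1.0], 4))
```

A propensity above 1 marks a grid point where contacts occur more often than chance would predict, and below 1 less often.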

SuperStar

• Calculate binding positions for specific probe atoms in protein active sites

• Identify functional groups in binding site

• Look up relevant IsoStar scatterplots and overlay on functional groups

• Contour - combining by taking products
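The combination step can be sketched as a point-by-point product of the overlaid maps. The dictionary representation and the choice to treat a grid point missing from a map as a neutral factor of 1.0 are assumptions for this example, not documented SuperStar behaviour.

```python
def combine_maps(maps):
    """Multiply propensity maps point by point.

    Each map is a dict {grid_point: propensity}. ASSUMPTION: a grid point
    absent from a map contributes a neutral factor of 1.0.
    """
    combined = {}
    for prop_map in maps:
        for point, propensity in prop_map.items():
            combined[point] = combined.get(point, 1.0) * propensity
    return combined


# Made-up maps from two neighbouring functional groups.
map_from_group_1 = {(0, 0, 0): 2.0, (1, 0, 0): 1.5}
map_from_group_2 = {(1, 0, 0): 2.0, (2, 0, 0): 0.5}
print(combine_maps([map_from_group_1, map_from_group_2]))
```

Taking products means a point favoured by both groups is reinforced, while a point disfavoured by either group is suppressed.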

SuperStar - Example Map

OH

SuperStar Features

• Cavity detection

• Surface or pharmacophore point display

• Metal coordination

• Hyperlinking to IsoStar scatterplots

• Choice of CSD- or PDB-based maps

• Gaussian fits

SuperStar Validation

• 265 PDB complexes

• Generate four maps (Me, C=O, NH, OH)

• See whether maps discriminate correctly, e.g. does Me have highest propensity where a ligand Me group is observed?

• Compute percentage success rate

• CSD 74%

• PDB 75%

• Gaussian CSD 70 - 74%

• PDB maps fuzzier, fewer probes possible

• Gaussian 4-5 times faster
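The discrimination test above can be sketched as follows: at each observed ligand group position, check whether the map for that group type has the highest propensity, then report the percentage of correct calls. The data layout and the function name `success_rate` are assumptions, and the numbers in the example are invented.

```python
def success_rate(observations):
    """Percentage of positions where the observed group's map scores highest.

    Each observation is (observed_group, {probe: propensity at that position}).
    """
    if not observations:
        raise ValueError("no observations supplied")
    correct = sum(
        1
        for observed, propensity_by_probe in observations
        if max(propensity_by_probe, key=propensity_by_probe.get) == observed
    )
    return 100.0 * correct / len(observations)


# Invented propensities at two ligand atom positions.
observations = [
    ("Me", {"Me": 3.1, "C=O": 0.8, "NH": 0.5, "OH": 0.9}),  # Me map wins
    ("OH", {"Me": 1.2, "C=O": 0.7, "NH": 2.4, "OH": 1.9}),  # NH map wins
]
print(success_rate(observations))
```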

Relibase+

• Protein-ligand database system

• Based on original software developed by Manfred Hendlich and colleagues at Merck and Marburg University

• Enables searching of PDB and of in-house proprietary databases

Some Relibase+ Options

• Text searching

• Sequence searching

• 2D substructure and similarity searching

• 3D substructure searching

• Logical combination of hit lists

• Searching for intermolecular interactions

• Auto-superposition of similar binding sites

• Scripting facility based on Python
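Logical combination of hit lists amounts to set algebra on entry identifiers. The sketch below uses plain Python sets and placeholder identifiers; it does not use the Relibase+ API, and the function name `combine_hits` is hypothetical.

```python
def combine_hits(hits_a, hits_b, op):
    """Combine two hit lists with a logical operator: 'and', 'or' or 'not'."""
    a, b = set(hits_a), set(hits_b)
    if op == "and":
        return sorted(a & b)   # entries in both lists
    if op == "or":
        return sorted(a | b)   # entries in either list
    if op == "not":
        return sorted(a - b)   # entries in the first list only
    raise ValueError("unknown operator: " + op)


text_hits = ["entry_a", "entry_b", "entry_c"]      # placeholder identifiers
substructure_hits = ["entry_b", "entry_c", "entry_d"]
print(combine_hits(text_hits, substructure_hits, "and"))
```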

Analysis of 3D Queries

Distance Distribution

Torsion Distribution

Benzamidine-Carboxylate Interactions

Binding Site Superposition

Example Python Script

    # Find all benzamidines
    # and check contacts to ASP under 3 Å

    relibase.load('dbase1')
    ba = relibase.Hitlist({'smiles': 'c1ccccc1C(=N)N'})
    new = relibase.Hitlist()
    for ligand in ba:
        for chain in ligand.contacts():
            for residue in chain.residues():
                if residue.name() == 'ASP':
                    ligatoms = ligand.atoms()
                    resatoms = residue.atoms()
                    d = mindist(ligatoms, resatoms)
                    if d < 3.0:
                        new.append(ligand)
    new.saveas('contact')
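The script relies on a `mindist` helper that the slide does not define. A stand-alone sketch of what such a helper presumably computes is given below, written for plain (x, y, z) coordinate tuples rather than Relibase+ atom objects; the name `min_distance` and the coordinate representation are assumptions.

```python
import math


def min_distance(coords_a, coords_b):
    """Shortest Euclidean distance between any point in A and any point in B."""
    if not coords_a or not coords_b:
        raise ValueError("both coordinate lists must be non-empty")
    return min(math.dist(a, b) for a in coords_a for b in coords_b)


# Made-up coordinates in angstroms.
ligand_atoms = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
residue_atoms = [(0.0, 3.0, 0.0), (9.0, 9.0, 9.0)]
print(min_distance(ligand_atoms, residue_atoms) < 3.5)
```

This brute-force version compares every pair of atoms, which is adequate for a single ligand and residue.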

Acknowledgements

• Manfred Hendlich

• Gerhard Klebe

• Ingo Dramburg

• Andreas Bergner

• Ian Bruno

• Jason Cole

• Paul Edgington

• Magnus Kessler

• Jie Luo

• Clare Macrae

• Patrick McCabe

• Willem Nissink

• Jon Pearson

• Scott Rowland

• Barry Smith

• Marcel Verdonk