Bulk Model Construction and Molecular Replacement in CCP4 Automation Ronan Keegan, Norman Stein, Martyn Winn.




Overview

• Brute-force search method for the best model for Molecular Replacement on a target structure.

• Python script utilising HPC resources.

• Can also run on a single machine.

• Two main parts:
  – Model generation using a variety of methods.
  – Feeding a selection of the best models into an MR program.

• User input requirements: target sequence and associated MTZ file.



Process Target Information

• Calculate molecular weight

• Estimate number of molecules in the a.s.u.

• Parse MTZ file for any relevant parameters
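The first two steps can be sketched in Python as below. This is a minimal illustration, not the program's own code: it assumes the copy number is estimated from the Matthews coefficient (cell volume divided by the total protein mass in the cell), and the function names, the target value of 2.5 Å³/Da and the `max_copies` limit are all assumptions made for the example.

```python
import math

# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015


def molecular_weight(sequence):
    """Approximate molecular weight (Da) of a one-letter protein sequence.

    Raises KeyError for non-standard residue codes.
    """
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms (lengths in Angstroms, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def estimate_copies(volume, n_symops, mw, target_vm=2.5, max_copies=60):
    """Pick the copy number whose Matthews coefficient is closest to target_vm.

    target_vm (cubic Angstroms per Da) is an assumed typical value, not a
    parameter taken from the slides. Returns (copies, matthews_coefficient).
    """
    best = None
    for n in range(1, max_copies + 1):
        vm = volume / (n_symops * n * mw)
        if best is None or abs(vm - target_vm) < abs(best[1] - target_vm):
            best = (n, vm)
    return best
```

In practice the cell dimensions and the number of symmetry operators would come from the MTZ header; here they are passed in by hand, for example `estimate_copies(cell_volume(50, 60, 70, 90, 90, 90), 4, 20000.0)`.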


Searching for Homologous Structures

• Using the target sequence, the program consults services based at the EBI for homologous structures, found by sequence matching (OCA).

• The top match from the sequence-based search is then used for a secondary-structure-based search using the MSDFold/SSM web service.

• Using the results from the above searches, the program also consults PQS at the EBI for any related multimeric structures.

• As an additional option, the top hits from the search can be aligned using Superpose to construct an ensemble of models to be used at the Molecular Replacement stage.
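The order of the searches can be sketched as below. The three search arguments are hypothetical callables supplied by the caller; they stand in for the OCA, MSDFold/SSM and PQS lookups, whose real interfaces are not described in these slides. The function name and the `max_hits` cap are also assumptions.

```python
def find_homologues(sequence, sequence_search, structure_search,
                    multimer_search, max_hits=20):
    """Chain the three searches and return (hits, multimers).

    sequence_search(sequence) -> list of PDB ids, best match first
    structure_search(pdb_id)  -> list of structurally similar PDB ids
    multimer_search(pdb_id)   -> list of related multimer entries
    All three are placeholders for the remote services.
    """
    seq_hits = list(sequence_search(sequence))
    hits = list(seq_hits)

    # Only the top sequence match seeds the structure-based search.
    if seq_hits:
        hits.extend(structure_search(seq_hits[0]))

    # Remove duplicates (case-insensitively) while keeping the ranking order.
    seen = set()
    unique = []
    for pdb_id in hits:
        key = pdb_id.lower()
        if key not in seen:
            seen.add(key)
            unique.append(pdb_id)
    unique = unique[:max_hits]

    # Look up multimeric assemblies for every retained hit.
    multimers = []
    for pdb_id in unique:
        multimers.extend(multimer_search(pdb_id))
    return unique, multimers
```

Passing the services in as functions keeps the sketch independent of any particular web API and makes it easy to test with canned results.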


Model Construction

• Once the search stage has been completed, all of the associated PDB structure files are retrieved.

• These are then manipulated in several different ways to create a plethora of possible models:
  – 1) PDB Clipping (Pdbcur, Pdbset, Coord_format):
    • Waters and hydrogens are removed
    • Any anomalies in the structure file, such as empty fields, are corrected (e.g. missing chain identifiers)
    • The most probable conformations are selected
    • Individual chains are extracted
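The clipping steps can be illustrated with a small pure-Python stand-in that works directly on fixed-column PDB records. It does not call Pdbcur, Pdbset or Coord_format and is not their algorithm. The function name, the default chain identifier, the use of highest occupancy as the meaning of "most probable", and the hydrogen heuristic for files without an element column are all assumptions.

```python
def clip_pdb(lines, default_chain="A"):
    """Return {chain_id: [atom records]} after simple clean-up.

    Drops waters and hydrogens, fills blank chain identifiers with
    default_chain, keeps only the highest-occupancy copy of each atom,
    and groups the surviving records by chain.
    """
    best = {}  # (chain, residue number + insertion code, atom name) -> (occupancy, record)
    for raw in lines:
        if raw[:6].strip() not in ("ATOM", "HETATM"):
            continue
        line = raw.rstrip("\n").ljust(80)

        if line[17:20].strip() in ("HOH", "WAT"):        # waters
            continue
        atom_name = line[12:16].strip()
        element = line[76:78].strip().upper()
        # Heuristic: without an element column, treat names such as "HB2"
        # or "1HB" as hydrogens.
        if element in ("H", "D") or (
            not element and atom_name.lstrip("0123456789").startswith("H")
        ):
            continue

        chain_id = line[21] if line[21] != " " else default_chain
        try:
            occupancy = float(line[54:60])
        except ValueError:
            occupancy = 1.0

        # Write back the chain id and blank the alternate-location flag.
        line = line[:16] + " " + line[17:21] + chain_id + line[22:]
        key = (chain_id, line[22:27], atom_name)
        if key not in best or occupancy > best[key][0]:
            best[key] = (occupancy, line.rstrip())

    chains = {}
    for (chain_id, _, _), (_, record) in best.items():
        chains.setdefault(chain_id, []).append(record)
    return chains
```

Each value in the returned dictionary can be written out as its own file to give one candidate search model per chain.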


Model Construction

– 2) Molrep

• Uses its own sequence alignment to prune the side chains.

• Side chains are stripped to their lowest common parts.

– 3) Chainsaw (Norman Stein)

• Input sequence alignment is used to strip side chains.

• More severe pruning than Molrep: “mixed model”.

• Can be given many possible alignments to create different models from the same structure.

• Can use sophisticated sequence alignment methods such as PSI-Blast and FFAS.
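Alignment-driven pruning of this kind can be sketched as follows. This is not the rule set used by Chainsaw or Molrep: the keep/truncate/delete decisions, the choice to truncate at the CB atom, and the function names are assumptions chosen only to show how an alignment can drive side-chain pruning.

```python
BACKBONE_AND_CB = ("N", "CA", "C", "O", "CB")


def plan_pruning(target_aln, template_aln):
    """Decide what to do with each template residue, given a pairwise alignment.

    Both arguments are gapped one-letter strings of equal length ('-' = gap).
    Returns a list of (template_residue, action), one entry per template
    residue, where action is 'keep', 'truncate' or 'delete'.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    plan = []
    for target_res, template_res in zip(target_aln, template_aln):
        if template_res == "-":
            continue                      # target insertion: nothing to prune
        if target_res == "-":
            plan.append((template_res, "delete"))    # no counterpart in target
        elif target_res == template_res:
            plan.append((template_res, "keep"))      # conserved: keep side chain
        else:
            plan.append((template_res, "truncate"))  # differs: cut side chain back
    return plan


def atoms_to_keep(action, atom_names):
    """Filter one residue's atom names according to the planned action."""
    if action == "keep":
        return list(atom_names)
    if action == "truncate":
        return [name for name in atom_names if name in BACKBONE_AND_CB]
    return []
```

Because the plan depends only on the alignment, running it with several alternative alignments of the same template gives several different models from one structure.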


Molecular Replacement

• A cluster or HPC resource spawns multiple MR jobs, each taking one of the constructed models along with the target structure data.

• Phaser/Amore/Molrep can all be used to do the MR.

• Phaser is used for the ensemble of top hits.

• If and when the MR program fits the model structure to the target data, the resulting PDB file is processed using Refmac to assess whether it is likely to refine.

• Results are then provided to the user for all of the top-scoring models.

• The user can retrieve the refined structures along with any of the associated log files.
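The fan-out over models can be sketched on a single machine with a thread pool. `run_mr` and `refine` are hypothetical callables standing in for whatever launches the MR program and Refmac; ranking by lowest Rfree, the function name and the defaults are assumptions. On a cluster the pool would be replaced by the batch system.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(models, run_mr, refine, max_workers=4, top_n=5):
    """Run one MR job per model and return the best refined results.

    run_mr(model)    -> a solution object, or None if no solution was found
    refine(solution) -> Rfree after refinement (lower is better)
    Both are placeholders supplied by the caller.
    """
    def job(model):
        solution = run_mr(model)
        if solution is None:
            return None                  # MR failed: nothing to refine
        return {"model": model, "rfree": refine(solution)}

    # Threads suffice here because the real work would happen in
    # external programs, not in the Python interpreter.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = [r for r in pool.map(job, models) if r is not None]

    results.sort(key=lambda r: r["rfree"])
    return results[:top_n]
```

Swapping Phaser for Amore or Molrep then amounts to passing a different `run_mr` callable.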


e-HTPX

Jobs can be submitted via the e-HTPX portal to the Daresbury e-HTPX computational resources (cluster or Condor pool) or, if the user has a Grid Certificate, to the UK National Grid Resources.

Users can monitor the job results as they are produced via a web page hosted on the e-HTPX server machine, and they are notified by email when their job is complete.

Refined structure files are made available to the user for download upon completion.

First external user as of a couple of days ago!


JCSG Targets

Columns: Target; Resolution (Å); a.s.u.; Hits from OCA / SSM; Top hit (JCSG model); Phaser LLG; Refmac Rfree; RMSD final model

• 1vrd; 2.2; 2 x 482; 37 / 0; 57% (57%)
  – Phaser LLG: 1eep_A PDBCLP 1634 (1zfj_A PDBCLP 964)
  – Refmac Rfree: 1eep_A PDBCLP 33.6; 1eep_B MOLREP 40.7 (1zfj_A PDBCLP 40.8)

• 1vrg; 2.9; 6 x 515; 30 / 0; 61% (58%)
  – Phaser LLG: 1xnv_A PDBCLP 9498 (1on3 hexamer 7269)
  – Refmac Rfree: 1xo6_F PDBCLP 34.8; 1xnv MULTMR 35.0; 1xnv_B MOLREP 51.9 (1on3 hexamer 37.9)

N.B. good homologues available

Currently working through more challenging examples …


Other points

• Program can also be run on a single machine in a scaled-down fashion.

• Can be run from the command line.

• Easy to swap out Phaser and run Amore, Molrep or another MR program instead.

• Modularised: model construction can be run on its own.

• Other model-generating methods can easily be inserted.


Future Plans

• Make it smarter and quicker.

• Use better sequence alignment methods such as PSI-Blast, FFAS.

• Use Norman’s Chainsaw program as an extra model creation method.

• Incorporate Norman’s Amore wrapper.

• Integrate it into Graeme’s XIA project – make use of scheduler code wrappers & provide a Model Generation module for XIA-MR.