

www.ccdc.cam.ac.uk

The newsletter for The Cambridge Crystallographic Data Centre

Crystal line November 2011

Introducing the CSD Solid Form Suite
Knowledge-based development and risk assessment

• How can data mining help understand the solid form behaviour of an active ingredient?
• How many polymorphs are there for a given active ingredient?
• What is the risk that there is a more stable form?
• How do I find a form with improved properties?

CSD Solid Form Suite has been developed to address these and other questions, using a Solid Form Informatics approach.

Solid Form Informatics
Solid Form Informatics supports key decisions about development routes of solid formulations of drugs, agrochemicals and molecular materials. It provides knowledge-based assessment of potential polymorphism and associated manufacturing risks, as well as information vital to reformulation and the tuning of property profiles. Solid Form Informatics distils the vast amount of structural data available today into targeted and actionable information. It is poised to become a powerful new asset supporting a Quality by Design approach and its adoption can help significantly reduce the risk of late stage failure that has been so costly to the industry.

The cost of failure
If there are gaps in the understanding of polymorphism and the solid form behaviour of a material, the consequences can be severe. Late stage failure is extremely costly and problems encountered after launch can be devastating. In the widely reported case of ritonavir, Abbott experienced manufacturing problems and were no longer able to produce the original form of the drug. The human costs started to mount as the drug was in wide use as part of antiretroviral treatment. The case was finally resolved as a more stable polymorphic form was identified, and new production and distribution methods adopted. The case cost Abbott hundreds of millions to resolve, and an estimated $250m in lost sales in 1998 alone.

Another well known case is that of the early-onset Parkinson’s disease drug rotigotine. Rotigotine was filed in 2003 as a substance that doesn’t show polymorphism and was administered successfully as ‘Neupro’ skin patches. However, in 2008 dendritic structures were observed in these patches. A new form had crystallised and this affected the efficacy of the patches. The product had to be withdrawn, affecting the improved quality of life many patients had experienced from the sustained release patches.

The Crystal Form Consortium
In 2008, a number of leading pharmaceutical and agrochemicals organisations partnered with scientists from the CCDC to address the lack of predictive analytics tools for solid form development. The aim was to develop new tools to inform on polymorphism and aid co-former design, based on the knowledge contained in the millions of intermolecular interactions stored within the Cambridge Structural Database (CSD).

Guided by member priorities and CCDC expertise in data mining, the Consortium developed Solid Form Informatics tools, focusing initially on polymorphism.

Polymorphism assessment
The result of the Consortium efforts is the CSD Solid Form Suite. In particular, the suite enables knowledge-based polymorphism assessment based on a statistical analysis of hydrogen bonding patterns. The new H-bond Propensity tool predicts hydrogen bonding outcomes, using the knowledge of intermolecular interactions contained within the CSD. By understanding how the environment of a functional group affects its ability to participate in intermolecular interactions, it is possible to predict H-bonding outcomes in target structures. Relevant structures are abstracted from the database and used to build a predictive model. Values for the propensity for a hydrogen bond to form between two functional groups can then be calculated. High propensity values indicate likely intermolecular interactions, whereas low propensity hydrogen bonds can indicate issues with stability.

Combining these propensities for hydrogen bond formation with a statistical model that captures information about how often a functional group participates in an H-bond allows the in-silico generation of chemically reasonable alternative crystal forms. The resultant view of the solid state landscape of an active ingredient addresses questions such as how likely polymorphism is and whether there is the possibility of a more stable form.

Consider temozolomide (Figure 1), used to treat astrocytoma, and marketed under the brand name Temodal by Schering-Plough. The calculated propensity/participation chart (Figure 2) shows that the Form II crystal structure (indicated by a pink dot) is ranked well on both propensity and participation axes.

Figure 1. Temozolomide, marketed under the brand name Temodal by Schering-Plough.

Figure 2. The propensity/participation chart for temozolomide provides an easily interpretable view of the solid form landscape aiding both understanding and communication.
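For readers who like to see the idea in code, the propensity calculation described under "Polymorphism assessment" might be sketched roughly as follows. This is only an illustration, not the H-bond Propensity tool itself: the article does not state the form of the predictive model, so a logistic function is assumed here, and every descriptor name, coefficient and donor/acceptor label is invented for the example.

```python
import math


def hbond_propensity(intercept, coefficients, descriptors):
    """Logistic estimate of the probability that a donor-acceptor pair forms an H-bond.

    ASSUMPTION: the logistic form and all descriptor names are placeholders;
    a real model would be fitted to relevant structures taken from the CSD.
    """
    z = intercept + sum(coefficients[name] * value for name, value in descriptors.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_pairs(intercept, coefficients, pairs):
    """Return (label, propensity) tuples sorted from most to least likely."""
    scored = [(label, hbond_propensity(intercept, coefficients, d)) for label, d in pairs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented numbers for illustration only.
INTERCEPT = 0.5
COEFFICIENTS = {"competing_acceptors": -0.4, "donor_steric_density": -0.02}
PAIRS = {
    "donor D1 ... acceptor A1": {"competing_acceptors": 1, "donor_steric_density": 20.0},
    "donor D1 ... acceptor A2": {"competing_acceptors": 3, "donor_steric_density": 20.0},
}

if __name__ == "__main__":
    for label, p in rank_pairs(INTERCEPT, COEFFICIENTS, PAIRS):
        print(f"{label}: {p:.2f}")
```

In this toy model a pair with more competing acceptors receives a lower propensity, which mirrors the article's point that the environment of a functional group affects its ability to take part in an interaction.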

continued overleaf »


A Free New Tool for Automated Reduced Cell Checking

A common problem during single crystal X-ray diffraction experiments is the unnecessary collection of datasets for crystals that subsequently turn out to be starting materials or a reaction by-product rather than the intended chemical sample. This can mean many wasted hours of diffractometer time collecting data to determine a crystal structure that is already known and published. The problem can be avoided by checking the unit cell of the sample against the CSD prior to the collection of a full dataset. A reduced unit cell search against the CSD will quickly and accurately identify if there are any publicly available structures with similar cells.

A new tool called CellCheckCSD has now been developed specifically for the purpose of performing automated unit cell checks against the CSD during data collection. This program comes pre-packaged with the reduced cell information in the CSD as a stand-alone installer. CellCheckCSD runs as an independent command-line tool which means that the full CSD System does not need to be installed on the diffractometer PC and the user does not even need a CSDS licence.

CellCheckCSD was developed as part of a collaboration with Agilent Technologies to provide the search in an automated fashion through the program CrysAlisPro. This automated nature means that as soon as a unit cell has been determined (or updated) within CrysAlisPro, the software can perform a unit cell check without any user intervention. Any CSD matches found can be easily viewed within CrysAlisPro or output to WebCSD or Mercury for further visualisation.

How to get it

CellCheckCSD is a free service that has been developed for automated use.

To download an installer for CellCheckCSD free of charge, users simply need to complete a basic registration step on the CCDC website: http://www.ccdc.cam.ac.uk/free_services/cellcheckcsd/

Dr Pete Wood, Research and Applications Scientist
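To give a feel for what a unit cell check involves, here is a minimal sketch of comparing a measured cell against a small table of known cells. It is not the CellCheckCSD algorithm: it assumes that every cell has already been reduced to the same convention, the tolerances are arbitrary choices, and the entries named HYPO01 and HYPO02 are invented rather than taken from the CSD.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume from lengths (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def cells_match(cell, ref, length_tol=0.015, angle_tol=1.0):
    """True if two reduced cells agree within tolerance.

    length_tol is a fraction of the reference length; angle_tol is in degrees.
    Both defaults are illustrative assumptions.
    """
    lengths_ok = all(abs(x - y) <= length_tol * y for x, y in zip(cell[:3], ref[:3]))
    angles_ok = all(abs(x - y) <= angle_tol for x, y in zip(cell[3:], ref[3:]))
    return lengths_ok and angles_ok


def find_matches(cell, known_cells):
    """Return the names of all known cells that match the measured cell."""
    return [name for name, ref in known_cells.items() if cells_match(cell, ref)]


# Hypothetical reference cells: (a, b, c, alpha, beta, gamma).
KNOWN_CELLS = {
    "HYPO01": (10.0, 12.0, 15.0, 90.0, 90.0, 90.0),
    "HYPO02": (7.0, 8.0, 9.0, 90.0, 100.0, 90.0),
}

if __name__ == "__main__":
    measured = (10.05, 11.95, 15.10, 90.0, 90.3, 90.0)
    print(find_matches(measured, KNOWN_CELLS))
```

A check of this kind is cheap enough to run every time the cell is determined or updated, which is the point of doing it before a full dataset is collected.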

Deposition of structure factors
Structure factors are now being accepted along with CIFs for deposition to the CSD via our web-based deposition form at: http://www.ccdc.cam.ac.uk/services/structure_deposit

Crystal structure coordinates are only an interpretation of the underlying electron density. The central storage of the associated structure factors alongside the CIF provides a more complete, unchanging record of the experimental data. With the availability of structure factors the crystallographic community will be able to return to previously refined structures and re-process them using up-to-date refinement techniques. An example is the PLATON/SQUEEZE methodology for modelling disordered solvent or counter-ion moieties. Published crystal structures could be revisited using this methodology to produce alternative interpretations of the structural model.

"The deposition of structure factors to the CSD is extremely welcome and will greatly facilitate the work of reviewers" said Sylvain Bernès, co-editor of Acta Crystallographica Section E. "Archived structure factors also promise to be a rich source of data for research related to statistical bias in X-ray structures and in the detection of systematic errors in data collections".

The deposition of structure factors will also help guard against fraud. Recently there have been a number of high-profile retractions of papers from the peer-reviewed literature. These cases have typically involved manual alterations to cell constants and atom types to produce what appeared to be genuine structure determinations of new compounds. Many of these problems could not have been discovered without the availability of structure-factor files. The International Union of Crystallography (IUCr) now strongly encourages the provision of machine-readable CIF and structure-factor files for every submitted structure. Further information on publication standards can be found on the IUCr website: http://www.iucr.org/index.html/leading-article/2011/2011-06-02

Dr Gary Battle, Marketing and Communications Manager

However, the observed structure is not well separated from other possible structures, indicating that there may be competing hydrogen bond pairings leading to alternative, stable structures. Indeed, looking in the CSD we find that there are a further two polymorphs of temozolomide. Each polymorph is a Z’ = 2 structure and the hydrogen bond pairings observed in these polymorphs are found to be those indicated by the cluster of triangles surrounding the Form II structure.

Calculation of the likelihood of hydrogen bond formation also has applications in co-crystal design. A comparison of the propensity values calculated for homo-molecular interactions with hetero-molecular interactions reveals whether or not interactions between the active ingredient and coformer are likely, allowing refinement of experimental strategies.
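The homo- versus hetero-molecular comparison can be pictured with a very small sketch. The decision rule used here, comparing the single best propensity on each side, is a simplifying assumption made for illustration, and the interaction labels and values are invented; the article does not describe how the suite itself makes the comparison.

```python
def best_propensity(propensities):
    """Highest value in a mapping of interaction label -> propensity."""
    return max(propensities.values())


def hetero_advantage(homo, hetero):
    """Best hetero-molecular propensity minus best homo-molecular propensity.

    A positive result suggests (under this simplified rule) that the active
    ingredient is more likely to bond to the coformer than to itself.
    """
    return best_propensity(hetero) - best_propensity(homo)


# Invented propensities for illustration only.
HOMO = {"API donor ... API acceptor": 0.45}
HETERO = {
    "API donor ... coformer acceptor": 0.70,
    "coformer donor ... API acceptor": 0.30,
}

if __name__ == "__main__":
    print(f"hetero advantage: {hetero_advantage(HOMO, HETERO):+.2f}")
```

A coformer that scores a clearly positive advantage would be a natural candidate to try first in the laboratory, while a negative value suggests the active ingredient prefers to pair with itself.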

Availability
The CSD Solid Form Suite helps scientists make better informed decisions and reduces the risks involved in pharmaceutical and agrochemical development. The CSD Solid Form Suite is available now as part of the 2012 release of the CSD System.

For more information please contact [email protected]. White papers and case studies are also available from the CCDC website.

Dr Gary Battle, Marketing and Communications Manager, Dr Gerhard Goldbeck, Marketing Consultant


Illustration of CellCheckCSD results in CrysAlisPro and WebCSD


CCDC at the IUCr

A major highlight of the CCDC calendar is the triennial IUCr Congress. This is always an excellent event for CCDC to present our ideas and developments, but most importantly, it’s a fantastic opportunity to listen to our data depositors and users of the CSD system and CCDC software.

CCDC were present in Madrid in force, giving our user community the chance to put our developers, database editors, user support scientists, research scientists and management on the spot! For the first time at an IUCr meeting, we held a CSD Open Discussion Forum. For 90 minutes, the 50 attendees raised a wide variety of topics. Strong support was shown for our decision to accept structure factors for archive and our strategy to first encourage their deposition, and then create a position where the deposition of structure factors along with coordinate data becomes the norm. This was reinforced by our user survey: although only 16% of depositors at the forum had deposited structure factors, 80% would do so in the future.

We learned of the desire for additional information to be captured in the CSD, the need to be clearer about our editing process, how we complicate structure analysis by changing atom labels which can’t then be altered, and received many suggestions for CSD system improvements. We can’t address everything immediately, but are working hard on new internal informatics systems and a new format for the CSD itself. A lively discussion on the importance of supporting Apple Macs, tablet devices and smartphones elicited some diverse views. We think it best to devote as much of our resource as possible towards the science required to improve the CSD system, but will keep supporting as wide a range of platforms as we can. We will also hold a series of webinars, addressing areas where users felt they would benefit from more training.

CCDC also made a significant contribution to the scientific program. Pete Wood co-organised a successful workshop for younger scientists, while Ian Bruno, Gary Battle and Colin Groom co-chaired microsymposia on Validation, Error Detection and Fraud Prevention, Applications of Crystal Structure Information in Chemical Education, and Crystallography in Disease and Therapy, respectively. CCDC staff also gave six presentations, covering crystal structure prediction (Aurora Cruz-Cabeza), Uses of crystal structure information in chemical education (Gary Battle and Frank Allen), Structure validation (Matthew Lightfoot), Polymorphism of pharmaceutical materials (Pete Wood) and Business models for database organisations (Colin Groom).

A conference highlight was undoubtedly the Celebration of the life of Lodovico Riva di Sanseverino, founder of the Erice Crystallography Schools. The session was organised by Paola Spadon, Frank Allen and Keiichiro Ogawa and attended by Lodovico’s son and daughter-in-law, Clemente and Alessandra. The session began with reminiscences of Lodovico’s career, including several periods spent at the embryonic CCDC in the 1960s and 1970s. It was a delight to hear recollections of these periods from Frank Allen, and also to hear how the Erice schools influenced one of our research scientists, Aurora Cruz-Cabeza. Lodovico always encouraged music and singing ‘mixers’ at Erice, so the session was magnificently rounded off with six favourite songs from the Erice Songbook, amid a few tears, many smiles and a good deal of laughter.

Dr Colin Groom, Executive Director

The CCDC’s Gary Battle pictured with our poster prize winners from the IUCr! Congratulations to Katarzyna A. Solanko, Kate Davies and Amol G. Dikundwar

Come and find us on Facebook and Twitter!

Since February, the CCDC has been on Facebook and Twitter, using both social media platforms as a way of keeping our followers up to date with our latest news and developments. We tweet and post status updates about our software releases and updates, conferences that we are attending, interesting research articles and structures, as well as news about the CCDC staff! If you’re not following us already, you can find us on Facebook at www.facebook.com/ccdc.cambridge and on Twitter at www.twitter.com/ccdc_cambridge. When we’ve reached 500 followers on Facebook we’ll be giving away a 16GB Apple iPad to one lucky follower, so what are you waiting for? Come and get involved in our online community!

Lauren Thomas, Account and Marketing Manager


Spotlight on Relibase 3.1

Remind me again what Relibase+ is used for?
Relibase+ allows you to perform powerful searches and analyses of protein-ligand interaction space. This is particularly useful during project initialisation when you quickly need to understand the intricacies of a new protein target. Many people also use it to derive general knowledge about protein-ligand interactions.

I use the PDB to look at protein structures. What advantages does Relibase+ offer?
One of the most powerful aspects of Relibase+ is its ability to search for similar cavities in a sequence independent manner. This can be used to generate chemical starting points from parts of ligands bound to sub-pockets similar to the sub-pocket of interest. It can also be used to get a handle on target cross-activity.

That sounds extremely useful. Does Relibase+ offer anything else?
Relibase+ offers lots of things. The ability to organise your searches into hit-lists and to combine these using boolean logic. There is also an easy to use sketcher interface for searching for protein-ligand interactions. Both 2D and 3D constraints can be added to the sketcher searches as well as protein secondary structure constraints. The list goes on. However, one of the most important features of Relibase+ is that it allows you to create your own databases to archive your in-house structures. These can be searched either independently of or in conjunction with the data derived from the PDB.

I remember struggling to create in-house databases when using Relibase+ in the past…
You and many others… This is why, in version 3.1, we have been working on improving the user experience of creating in-house databases. Both the web-based graphical user interface and the command line interface have been overhauled. The graphical user interface makes it trivial to create small databases of handfuls of structures, or large databases if updated continually as new structures are determined. The command line interface is more powerful and makes it possible to port large backlogs of in-house protein structural data, minimising the amount of manual editing required.

That’s great! I have been looking for an easy way of storing and organising my in-house structures. When will version 3.1 be available?
Version 3.1 of Relibase+ was released in June so you can access it now. Please get in contact with [email protected].

Wait a minute, I remember Relibase+ being really difficult to install…
Not to worry. We have also overhauled the installer in version 3.1. It is now trivial to install Relibase+.

Dr Tjelvar Olsson, Research and Applications Scientist

CCDC Publications May 2011 to Nov 2011

The CCDC team frequently publish results of their research, which is often the work of collaboration with industrial or academic scientists. You can find the full list of our publications at www.ccdc.cam.ac.uk/publications. Here are our most recent titles, published since May 2011.

The extent of enthalpy-entropy compensation in protein-ligand interactions
T. S. G. Olsson, J. E. Ladbury, W. R. Pitt, M. A. Williams, Protein Sci. (2011) 20, 1607-1618. 10.1002/pro.692

Designing a new Diels-Alderase: A combinatorial semi-rational approach including dynamic optimization
M. Linder, A. Johansson, T. Olsson, J. Liebeschuetz, T. Brinck, J. Chem. Inf. Model. (2011) 51, 1906-1917. 10.1021/ci200177d

Structure prediction, disorder and dynamics in a DMSO solvate of carbamazepine
A. J. Cruz-Cabeza, G. M. Day, W. Jones, Phys. Chem. Chem. Phys. (2011) 13, 12808-12816. 10.1039/c1cp20927b

Docking performance of fragments and drug-like compounds
M. Verdonk, I. Giangreco, R. Hall, O. Korb, P. Mortensen, C. Murray, J. Med. Chem. (2011) 54, 5422-5431. 10.1021/jm200558u

Deducing chemical structure from crystallographically determined atomic coordinates
I. J. Bruno, G. P. Shields, R. Taylor, Acta Cryst. B (2011) 67, 333-349. 10.1107/S0108768111024608

New Software for Statistical Analysis of CSD Data
R. A. Sykes, P. McCabe, F. H. Allen, G. M. Battle, I. J. Bruno, P. A. Wood, J. Appl. Cryst. (2011) 44, 882-886. 10.1107/S0021889811014622

Teaching 3D structural chemistry using crystal structure databases 3: The Cambridge Structural Database System – database content and access software in educational applications
G. M. Battle, F. H. Allen, G. M. Ferrence, J. Chem. Ed. (2011) 88, 886-890. 10.1021/ed1011019

Teaching 3D structural chemistry using crystal structure databases 4: Advanced examples of discovery-based learning
G. M. Battle, F. H. Allen, G. M. Ferrence, J. Chem. Ed. (2011) 88, 891-897. 10.1021/ed1011025

Accelerating Molecular Docking Calculations Using Graphics Processing Units
O. Korb, T. Stuetzle, T. Exner, J. Chem. Inf. Model. (2011) 51, 865-876. 10.1021/ci100459b

The Effect of Pressure on the Crystal Structure of Bianthrone
R. D. L. Johnstone, D. R. Allan, A. R. Lennie, E. Pidcock, R. Valiente, F. Rodriguez, J. Gonzalez, J. E. Warren, S. Parsons, Acta Cryst. B (2011) 67, 226-237. 10.1107/S0108768111009657

Using the outer coordination sphere to tune the strength of metal extractants
R. S. Forgan, B. D. Roach, P. A. Wood, F. J. White, J. Campbell, D. K. Henderson, E. Kamenetzky, F. E. McAllister, S. Parsons, E. Pidcock, P. Richardson, R. M. Swart, P. A. Tasker, Inorg. Chem. (2011) 50, 4515-4522. 10.1021/ic200154y

An automated method for consistent helix assignment using turn information
O. Koch, J. C. Cole, Proteins, Structure, Function and Bioinformatics (2011) 79, 1416-1426. 10.1002/prot.22968

The Cambridge Structural Database: Experimental 3D information on small molecules is a vital resource for interdisciplinary research and learning
C. R. Groom, F. H. Allen, WIREs Comp. Mol. Sci. (2011) 1, 368-376. 10.1002/wcms.35

Computational design of a lipase for catalysis of a Diels-Alder reaction
M. Linder, A. Hermansson, J. Liebeschuetz, T. Brinck, J. Mol. Model. (2011) 17, 833-849. 10.1007/s00894-010-0775-8

The basis for target-based virtual screening: Protein structures
J. C. Cole, O. Korb, T. S. G. Olsson, J. W. Liebeschuetz, Virtual Screening. Principles, Challenges and Practical Guidelines, Ed. Christoph Sotriffer. Pub. Wiley, 2011.

Events

Over the coming months you can come and meet us at various meetings and events around the world.

Date | Conference, meeting or event | Venue | Activity
5-9 February 2012 | Lorne Conference on Protein Structure | Lorne, Australia | Attendance
25-29 March 2012 | Spring ACS Meeting | San Diego, CA, USA | Talk, Exhibition
16-19 April 2012 | Spring BCA Meeting | Warwick, UK | Exhibition
10-15 June 2012 | Gordon Research Conference on Crystal Engineering | Waterville Valley, NH, USA | Attendance
17-20 June 2012 | Fourth European Conference on Crystal Growth | Glasgow, UK | Attendance
15-18 July 2012 | SLA Annual Conference & INFO-EXPO | Chicago, IL, USA | Attendance