
The X-ray structure of a cobalamin biosynthetic enzyme, cobalt-precorrin-4 methyltransferase


articles

nature structural biology • volume 5 number 7 • july 1998 585

Heidi L. Schubert¹, Keith S. Wilson¹, Evelyne Raux², Sarah C. Woodcock² and Martin J. Warren²

Biosynthesis of the corrin ring of vitamin B₁₂ requires the action of six S-adenosyl-L-methionine (AdoMet)-dependent transmethylases, closely related in sequence. The first X-ray structure of one of these, cobalt-precorrin-4 transmethylase, CbiF, from Bacillus megaterium has been determined to a resolution of 2.4 Å. CbiF contains two α/β domains forming a trough in which S-adenosyl-L-homocysteine (AdoHcy) binds. The location of AdoHcy and a number of conserved residues helps define the precorrin binding site. A second crystal form determined at 3.1 Å resolution highlights the flexibility of two loops around this site. CbiF employs a unique mode of AdoHcy binding and represents a new class of transmethylase.

¹Protein Structure Group, Department of Chemistry, University of York, Heslington, York YO1 5DD, UK. ²Department of Molecular Genetics, Institute of Ophthalmology, University College London, 11-43 Bath Street, London EC1V 9EL, UK.

Correspondence should be addressed to K.S.W. email: [email protected]

The biosynthesis of nature’s most complex cofactor¹ comprises ~30 enzyme-mediated steps²⁻⁵, a requirement that approximates to about 1% of the proteins in an average bacterial genome⁶,⁷. This complex pathway can be separated into three distinct parts. The CobI pathway results in the synthesis of the corrin ring component, cobinamide, from the ubiquitous tetrapyrrole primogenitor uroporphyrinogen III (uro’gen III). The CobII pathway results in the synthesis of the lower axial ligand, dimethylbenzimidazole (DMB). The CobIII pathway results in the assembly of the final coenzyme from cobinamide, DMB and phosphoribosyl (Fig. 1)⁶.

In parallel to what is observed in heme biosynthesis⁸, the CobI segment of cobalamin synthesis contains independent aerobic and anaerobic branches. These CobI pathways differ in their timing of cobalt insertion and the requirement for molecular oxygen⁹,¹⁰. In the well-studied aerobic pathway of Pseudomonas denitrificans, cobalt is chelated into hydrogenobyrinic acid a,c-diamide to generate cob(II)yrinic acid a,c-diamide, a late corrin pathway intermediate⁴,⁹. In contrast, the anaerobic pathways of Salmonella typhimurium and Bacillus megaterium chelate cobalt at a much earlier intermediate, precorrin-2 (Fig. 1)¹¹. Although all the intermediates of the aerobic CobI pathway have been elucidated, few intermediates on the anaerobic route are known. Many of the aerobic, Cob, enzymes share a high degree of similarity with the anaerobic, Cbi, enzymes, suggesting that although independent, the two pathways are broadly similar (Fig. 1).

One of the unique features of cobalamin biosynthesis is the addition of eight S-adenosyl-L-methionine (AdoMet) methyl groups to the tetrapyrrole framework during corrin construction. The methyl groups are added by the action of six separate transmethylases, which are more closely related to one another than to other known sequences, suggesting that they have evolved from a common ancestral methylase. The aerobic/anaerobic methylases for the relevant carbons, listed in order of pathway addition⁷, are CobA/CysG at C2 and C7, CobI/CbiL at C20, CobJ/CbiH at C17, CobM/CbiF at C11, the aerobic CobF at C1, and CobL/CbiE at C5 and C15. Thus the structure elucidation of any one of them not only provides insight into the mechanism of precorrin methylation but also provides a model on which to base the structure of the other precorrin methylases.

Fig. 1 The vitamin B₁₂ biosynthetic pathway from uro’gen III, highlighting the structures of precorrin-4 and -5 in the anaerobic and aerobic pathways. The anaerobic cobalt-precorrin-4 methylase (CbiF) is most closely related to its aerobic homologue precorrin-4 methylase (CobM), but is also similar to other tetrapyrrole methylases such as CysG and CobA. Side chain notation: A = acetate, P = propionate.

Based on its homology with the aerobic C11 transmethylase, CobM, CbiF is assumed to methylate C11 of cobalt-precorrin-4 during the anaerobic biosynthesis of cobalamin, generating precorrin-5. The structure of its substrate, cobalt-precorrin-4, has only recently been determined and is quite different from that of the aerobic precorrin-4 (ref. 12).

A His-tagged form of CbiF from B. megaterium was cloned, expressed and purified, and the X-ray structure of crystals grown in 1.2 M phosphate was refined to an R-factor of 20.4% at a resolution of 2.4 Å. Cleavage of the His-tag allowed crystals of a phosphate-free form to be produced, and this structure has been refined at a resolution of 3.1 Å to an R-factor of 19.3%.

Structure of CbiF in phosphate

CbiF is composed of two α/β domains linked by a single coil forming a kidney-shaped molecule (Fig. 2a,b). Both domains contain a five-stranded β-sheet flanked by four α-helices, but there is no topological similarity between the two domains. The entire structure follows a β-α repeating pattern with the single exception of a β-hairpin late in the C-terminal domain (β8-β9). The parallel β-sheet in the N-terminal domain follows a 32415 topology, a topology seen before only in the six-stranded sheet of fructose permease subunit IIb¹³ (324156). The C-terminal domain contains a mixed sheet with 12534 topology. This strand order has been previously observed in protein tyrosine phosphatases, but with differences in direction of two of the strands.

Although there is only a monomer in the asymmetric unit, both biochemical and structural evidence suggest that the enzyme exists as a dimer¹⁴. The nearest crystallographic monomer shares a substantial 31% of its surface area with the original molecule (QUANTA¹⁵). Indeed, due to the relative twist of the two domains, the two C-terminal five-stranded sheets combine to generate a ten-stranded β-sheet (Fig. 2c). There are 36 direct protein–protein hydrogen bonds in the dimer interface. Of the 22 residues involved, six form hydrogen bonds through their side chain atoms (Fig. 3a,b) and are completely conserved among precorrin-4 transmethylases.

Two high density features were ascribed to inorganic phosphate ions, reflecting the presence of 1.2 M Na/KPO₄ in the crystallization medium. One phosphate lies within the proposed substrate binding site below the AdoHcy molecule within the N-terminal domain, forming hydrogen bonds with three waters, the amide of Thr 101 and the imidazole side chain of His 100. The fourth oxygen of this phosphate does not form any hydrogen bonds and lies 5.5 Å below the sulphur atom of AdoHcy. The second phosphate is involved in crystallographic lattice contacts, lying between the N-terminus (residues 17–20) and residues 161–162 within a loop of the C-terminal domain (β6-αF).

Fig. 2 S-Adenosyl-L-homocysteine (AdoHcy) bound to CbiF. a, Two α/β domains form a kidney-shaped protein with a proposed substrate binding trough in the center where AdoHcy is bound. AdoHcy is shown in ball-and-stick representation with carbons in yellow, oxygens in red, nitrogens in blue and sulfurs in green. b, Stereo Cα trace of the CbiF monomer with every tenth residue labeled. c, In the interface with the nearest crystallographic neighbor 31% of the molecular surface is buried to generate the functional dimer. The β-sheets of the C-terminal domains are continuous within the dimer and the active sites are oriented in opposite directions³⁸,³⁹.


Phosphate-free crystal form

The His-tag was removed from the recombinant CbiF and the resulting protein crystallized in the absence of phosphate in a new space group with tighter crystal packing. The overall fold remains the same. Although the resolution is limited to 3.1 Å, several significant changes are nevertheless evident, particularly in the conformations of two surface loops. As expected there are no bound phosphates. The r.m.s. deviation between the Cα atoms of the two forms is 0.66 Å.
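The r.m.s. deviation between equivalent Cα atoms can be illustrated with a short calculation. The following is a minimal sketch, assuming the two coordinate sets have already been superimposed and paired residue by residue; the function name `rmsd` and the toy coordinates are illustrative and are not taken from either CbiF model.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the two sets are already superimposed and listed in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å (not from the CbiF structures).
form_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
form_2 = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 1.0)]
print(f"r.m.s. deviation: {rmsd(form_1, form_2):.2f} Å")
```

In practice the superposition itself (a least-squares fit of one model onto the other) precedes this step and is handled by the structure analysis software.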

Conserved residues

An amino acid sequence alignment between four CbiF (anaerobic cobalt-precorrin-4 transmethylase) and three CobM (aerobic precorrin-4 transmethylase) sequences highlights a number of conserved regions between ‘precorrin-4’ transmethylases (Fig. 3a)¹⁴. Thirty-eight residues are identical in all seven proteins and can be grouped by function. Nine are involved in the binding of AdoHcy, including a glycine-rich region GAGPG (residues 27–31), Gly 102, Asp 103, Ala 135 and Leu 184. A further six make up a putative precorrin binding site: Ala 53, Ser 55, Leu 79, Arg 98, Glu 112 and Gln 113. The dimer interface involves six additional conserved residues, Thr 37, Glu 143, Leu 144, Gln 151, Thr 156 and Arg 157. Finally, the remaining 17 lie either in the hydrophobic core or in several tight turns (Fig. 3a,b).
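Identifying residues that are identical across all aligned sequences amounts to scanning the alignment column by column. The sketch below shows one way this could be done; the function name `identical_columns` and the toy alignment are hypothetical and do not reproduce the CbiF/CobM alignment. The final lines simply check that the four functional groups quoted above add up to the 38 conserved residues.

```python
def identical_columns(aligned_seqs):
    """Return 0-based alignment columns where every sequence carries the same residue.

    Columns containing a gap character ('-') are never counted as conserved.
    """
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for index, column in enumerate(zip(*aligned_seqs)):
        if "-" not in column and len(set(column)) == 1:
            conserved.append(index)
    return conserved

# Hypothetical three-sequence toy alignment (not real CbiF/CobM sequences).
toy_alignment = ["MGAGPGDK", "MGAGPGEK", "LGAGPG-K"]
print(identical_columns(toy_alignment))

# Residue counts per functional group, as given in the text.
groups = {"AdoHcy binding": 9, "precorrin binding": 6,
          "dimer interface": 6, "core or tight turns": 17}
print(sum(groups.values()))
```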

S-adenosyl-L-homocysteine binding site

AdoHcy is bound in one pocket of a large trough between the N- and C-terminal domains. The ligand lies at the carboxyl end of the parallel β-sheet in the N-terminal domain and slightly behind the last loop of the C-terminal domain, αH-β10. Residues dispersed throughout the polypeptide chain contribute both main and side chain atoms to the binding (Fig. 4a,b). AdoHcy bound to CbiF is kinked between the sulfur atom and the sugar ring to place both the homocysteine backbone and the adenosine ring into pockets of the trough, similar to a two-pronged plug in a socket.

Fig. 3 Sequence alignment of CysG, CobM and CbiF. a, Alignment of CbiF with the aerobic precorrin-4 transmethylase, CobM, and the methylase domain of sirohaem synthase, CysG. The highlighted residues are completely conserved in all known ‘precorrin-4 transmethylase’ (CobM/CbiF) sequences and have been colored according to function: red, AdoMet/AdoHcy binding; gold, tetrapyrrole binding site; purple, dimerization; green, structural core. b, Structural representation of the functional role of conserved residues. The Cα trace is colored by conservation among the CbiF methylases. Dark blue: seven out of seven sequences conserved, white: completely unconserved³⁸. Amino acid side chains colored as in Fig. 3a.

The binding of AdoHcy to CbiF is quite different from that found in other AdoMet-binding proteins. In the DNA and catechol transmethylases the AdoMet is in an extended conformation with an O4S-C4S-C5S-SD torsion angle of 173° (ref. 16). In contrast, for AdoHcy bound to CbiF this angle is 82° (Fig. 4b). This distorted conformation could well assist presentation of the methyl group to the bulky substrate as it would project into the precorrin binding site. Unlike the hydrophobic packing between Phe/Tyr residues and the nucleotide ligand in the DNA methylases, CbiF contains no large hydrophobic residues forming van der Waals interactions with the adenine ring. On one side of the ring, the Cβ of Ser 132 packs against the bridge carbons at a distance of 3.4 Å. The other side is loosely packed (4.5–5.0 Å) against the methylene carbons of Gln 240. There are a total of 15 hydrogen bonds between CbiF and AdoHcy (Fig. 4a). Two support the adenosine ring; the carbonyl oxygen of conserved Pro 30 lies 2.8 Å from N6, and the amide nitrogen of partially conserved Ala 213 shares a hydrogen with N1 at a distance of 2.8 Å. The sugar hydroxyls are within hydrogen bonding distance (2.6–3.3 Å) of the amide nitrogen of conserved Leu 184, partially conserved Ala 241, and two solvent molecules, Wat 519 and Wat 509. The homocysteine portion of the ligand participates in eight hydrogen bonds. Conserved Asp 103 forms hydrogen bonds from both its amide nitrogen (2.7 Å) and carbonyl oxygen (2.8 Å) to the terminal carboxyl and amine nitrogen of homocysteine respectively. There are hydrogen bonds between the homocysteine moiety and the partially conserved Thr 131 and Ser 132, through the side chain hydroxyls (Ser 132 OG is 2.6 Å from AdoHcy O1 and Thr 131 OG is 2.9 Å from AdoHcy O2). Ser 132 forms a hydrogen bond through its amide nitrogen at a distance of 2.9 Å from AdoHcy O1. There are three additional hydrogen bonds to the homocysteine amine from the carbonyl of partially conserved Thr 101 (3.2 Å), the carbonyl of Met 106 (2.6 Å) and a solvent molecule (2.9 Å).
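The contrast between the extended (173°) and kinked (82°) ligand conformations rests on the O4S-C4S-C5S-SD torsion angle. As an illustration, the sketch below computes a torsion (dihedral) angle from four atomic positions using the standard atan2 formulation. The function name `torsion_angle` and the toy coordinates are assumptions made for the example; they are not atoms from the CbiF model.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees for four points, e.g. O4S, C4S, C5S and SD."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)              # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)              # normal to the plane of p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical toy geometries: a right-angle twist and a fully extended chain.
kinked = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
extended = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
print(round(kinked, 1), round(abs(extended), 1))
```

Applied to the four named atoms of a deposited ligand, a function of this kind would distinguish the near-trans arrangement reported for the DNA and catechol transmethylases from the strongly kinked arrangement seen in CbiF.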

CbiF only crystallized in the presence of exogenous AdoMet or AdoHcy, although the derived structure contains only AdoHcy. Previously, despite AdoMet’s reputation for being kinetically unstable, an RNA methyltransferase structure containing bound AdoMet has been determined even though the molecule was not present in the crystallization medium¹⁷. Enzymes such as this RNA methyltransferase appear to preferentially bind and stabilize AdoMet over AdoHcy. In contrast CobA, a uro’gen III methylase, binds AdoHcy 20 times more tightly than AdoMet despite the small difference in structure (CH₃⁺)¹⁸. Calorimetric studies indicate that CbiF also preferentially binds the product AdoHcy (data not shown). The reactive methyl group can be physically accommodated in the current CbiF model, and the AdoMet complex is expected to be essentially equivalent to the AdoHcy-bound structure. Together these data suggest that CbiF may promote the breakdown of AdoMet by binding it in a conformation which favors displacement of the methyl group from the sulphonium ion. In principle, AdoHcy could be acting as a product inhibitor of the corrin biosynthetic transmethylases, preventing excessive production of vitamin B₁₂ and depletion of the C1 pool.


Precorrin binding and catalysis

The most likely binding site for cobalt-precorrin-4 is the large trough in the N-terminal domain (Fig. 4c). Several loops make up the walls of the trough. Loop β2-αB contains conserved residues Ala 53 and Ser 55 and, together with a second loop (β3-αC) containing conserved Leu 79, composes the left side of the trough. The right side of the trough is formed by loop β4-αD and helix αD containing conserved residues Glu 112 and Gln 113. The lower surface is lined by residues in strand β4. Several side chains (Arg 98, His 100, Gln 113, Thr 74, Thr 101, Arg 157, Gln 240, Lys 239) surround this binding site and are available for interactions with the eight carboxyl side chains of the substrate.

As a substrate, cobalt-precorrin-4 is expected to bind less tightly to CbiF than tetrapyrrole-derivatized cofactors to their cognate enzymes. In addition, the latter directly ligate the central metal ion; for example, a histidine in methionine synthase ligates the cobalt ion of methylcobalamin¹⁹. In CbiF, no residue is positioned to act as a cobalt ligand. The nearest feasible cobalt-binding residue is the non-conserved His 100, which lies at the bottom of the substrate trough on strand β4. For His 100 to act as a ligand, cobalt-precorrin-4 would have to bind in a perpendicular orientation with respect to the plane of the trough. The presence of the metal ion is not essential for transmethylation, as observed for CbiF from S. typhimurium, which can methylate precorrin-3 in the absence of a central cobalt, albeit with low efficiency²⁰. The structure of CbiF leaves few options for cobalt-precorrin-4 binding. Carbon 11 on ring C must lie within 3–4 Å of the methyl group for direct methyl transfer. Maximal enzyme–substrate contact requires rings A and B to be inserted into the trough, making salt bridges and hydrogen bonds with many surrounding residues.

Residues 53–57 and 73–80 on the left-hand side of the trough have the highest temperature factors in the molecule (between 50 and 70 Å²), indicating structural flexibility. It is possible that a conformational change occurs on binding of substrate, the enzyme accepting the tetrapyrrole in an ‘induced fit’ mechanism. This is supported by the differences observed between the two crystal forms. In the 2.4 Å structure, phosphate is bound at the bottom of the precorrin trough. In the 3.1 Å structure, the two flexible loops (53–57 and 73–80) move to take up the position previously occupied by the phosphate and reduce the width of the trough (Fig. 5). The Asp 54 Cα moves by up to 6 Å. Its side chain moves by 9 Å, pointing away from the proposed active site in the presence of bound phosphate, but pointing into it in the absence of phosphate. These movements may well be of functional significance.

Fig. 4 The AdoHcy binding pocket. a, Hydrogen bonds (2.6–3.3 Å) from surrounding residues are shown. The 82° torsion angle can be seen between O4S-C4S-C5S-SD (see (b) for a detail). The red lattice represents unbiased 1σ density from an Fo - Fc map generated before AdoHcy was added to the model. Hydrogen bonds from Thr 101 O to the homocysteine amine and the loose hydrophobic packing between the adenosine ring and Gln 240 have been removed for clarity. b, Alignment of the AdoHcy and AdoMet ligands of HhaI, TaqI (yellow) and CbiF (white) highlighting the 82° torsion angle of O4S-C4S-C5S-SD. c, The electrostatic surface of CbiF highlights a deep groove between the N- and C-terminal domains just below the AdoHcy binding site (blue, positive; red, negative)⁴⁰.

Assuming that AdoMet binds in the same conformation as AdoHcy, the position of the methyl group of AdoMet in the active site would allow the transfer of the methyl group to the corrin ring. It is known that the transmethylations at C2, C7 and C12 proceed with overall inversion of symmetry in accord with a direct SN2 displacement of the methyl group from AdoMet. To facilitate this reaction the ring carbon must be activated to function as a nucleophile. Though the cobalt ion may assist by acting as an electron donor to carbon 11, it is expected that the aerobic and anaerobic transmethylases have identical catalytic mechanisms due to their high degree of similarity. The lack of highly conserved charged residues surrounding the binding site suggests that catalysis is probably dominated by the lability of AdoMet and the proximity and orientation of the precorrin rather than a general acid/base mechanism.

The alluring prospect of forming an enzyme–substrate complex is technically challenging because the anaerobic intermediates are labile and oxygen sensitive². Several attempts have been made to form an enzyme–ligand complex. Crystals soaked in cyanocobalamin and cobyric acid did effect a color change (red or pink) in the soaked crystal, but three-dimensional structures derived from these crystals did not contain density indicative of bound ligand (data not shown). The color change may reflect non-specific binding sites of the ligand in the lattice or binding at very low occupancy. The recently described structure of cobalt-precorrin-4 (ref. 12), as well as the identification of the enzymes leading to its formation, will hopefully assist in this difficult task.

Comparison of CysG and CbiF

Of all the cobalamin biosynthetic transmethylases, E. coli CysG is the best characterized. E. coli CysG is a trifunctional enzyme that catalyses uro’gen III transmethylation at positions C2 and C7, NAD⁺-mediated dehydrogenation and ferrochelation in the production of siroheme and vitamin B₁₂. Experimental investigations into the catalytic mechanism of the transmethylase activity of CysG suggest that the enzyme binds AdoMet covalently. However, similar investigations with B. megaterium CbiF revealed that it binds AdoMet less tightly and non-covalently²¹. Neither an Fo - Fc map calculated prior to addition of AdoHcy to the model (Fig. 4a) nor the refined electron density indicates any covalent bond between AdoHcy and CbiF. If, as seems likely, the transmethylase region of CysG has a similar structure to CbiF, then the most likely residues for covalent bond formation in CysG would be those homologous to Thr 131 and Ser 132 in CbiF.

Several residues in CysG have been mutated to probe the AdoMet and tetrapyrrole binding sites²²: two within the glycine-rich AdoMet binding region (Gly 224 and Asp 227), three within the putative substrate binding site (Arg 298, Asp 303 and Arg 309) and two additional conserved charged residues (Asp 454 and Lys 473). Each mutant was tested for its ability to bind AdoMet and to rescue a cysG deletion strain of E. coli²¹. The mutagenesis results can be rationalized by alignment of the CysG sequence to the CbiF structure, giving structural insight into the effect of the disruptions.

Two CysG mutants, Gly224Ala and Arg298Leu, could not bind AdoMet. The residue equivalent to CysG-Gly 224 in CbiF is Gly 29, which forms part of the GAGPG motif common to all the cobalamin biosynthetic transmethylases. The flexibility of this glycine is crucial for forming the tight turn underneath the nucleotide, and changing this residue to Ala would disrupt the fold. The residue equivalent to CysG-Arg 298 in CbiF is Arg 98, which is not in direct contact with the AdoMet binding site, indicating that a larger structural distortion must be responsible for the lack of AdoMet binding in the CysG mutant. Though the Asp303Ala mutation did not disrupt AdoMet binding, the corresponding residue in CbiF, Asp 103, is involved in direct hydrogen bonds to AdoHcy through its peptide backbone. Presumably, these were not disrupted by mutation of the side chain.

Two CysG mutants which bind AdoMet (Asp248Ala and Arg309Leu) showed a reduced catalytic efficiency, possibly due to interference with substrate binding. The corresponding residues in CbiF, Asp 54 and Ala 109, lie on loops β2-αB and β4-αD, which frame the deep trough in the molecule and help define the precorrin-4 binding site. In support of this hypothesis, CbiF-Asp 54 is the residue which moved into the precorrin binding site in the phosphate-free structure. The additional mutants (CysG Asp 227 and Lys 270, which correspond to CbiF Asp 32 and Lys 73) had no observable effect on AdoMet binding or catalysis. This is consistent with these residues being distant from the active site.

Fig. 5 Comparison of CbiF in the presence and absence of phosphate. Loops β2-αB and β3-αC occupy alternate conformations in the two crystal forms. CbiF in phosphate buffer is shown in green (phosphate in red) and the phosphate-free form in yellow. The 1σ 2Fo - Fc density of the 3.1 Å phosphate-free model is shown in blue. In the 2.4 Å structure a phosphate lies in the precorrin binding site, but in its absence these two loops reorient to decrease the overall width of the active site, and replace the phosphate. The Asp 54 Cα moves 6 Å and the residue flips 180°, moving the side chain by 9 Å.

CbiF: a new class of methylase

Other AdoMet-dependent transmethylases such as HhaI, TaqI and catechol O-methyltransferase share a similar α/β tertiary fold and it was proposed that many (if not all) have a common catalytic domain structure²³. Despite the fact that CbiF also has an α/β-based bi-lobal architecture containing a parallel β-sheet, its overall topology and the manner in which it binds AdoHcy make it radically different (Fig. 6a). The α/β structure of the domains can be superimposed in two ways. Optimal topological superposition of the N-terminal domains using DALI²⁴ results in the AdoMet binding sites sitting at opposite ends of the molecule (Fig. 6b): TaqI (PDB code²⁵ 2ADM), r.m.s. deviation 3.6 Å over 69 residues; HhaI (PDB code 3MHT), r.m.s. deviation 2.7 Å over 60 residues. An alternate alignment, where the β-sheet is flipped by 180°, overlaps the ligand binding sites but the topological similarity is even lower (Fig. 6b).

The most common class of protein domain structure is α/β and such domains can always be aligned to some extent. DALI²⁴ suggests significant similarity of CbiF with 122 α/β proteins. Indeed, the best alignment is with the GTPase fragment of the signal sequence recognition protein from Thermus aquaticus, Ffh²⁶ (PDB code 1FFH), with an r.m.s. deviation of 3.0 Å over 79 residues, rather than with the HhaI, TaqI and catechol O-methyltransferases. Thus CbiF represents a new class of small molecule transmethylase.

A survey of the current protein data base²⁵ indicates that the domain folds are predominantly α/β for enzymes which use the AdoHcy and AdoMet ligands as substrates, but α-helical in nature for proteins which use the ligands in regulatory functions²⁷. The exception may be the activation domain of methionine synthase, a primarily helical domain that binds AdoMet. This domain functions primarily to store AdoMet for periodic reactivation of the cobalamin cofactor and also functions to present the ligand for enzymatic turnover²⁸.

Early thoughts on metabolic enzyme evolution propounded the idea that enzymes within a pathway may be structurally related. This was based on the concept that an enzyme already has a natural recognition pocket for its product and that slight modification of the enzyme through gene duplication would allow it to undertake a new reaction using its old product as substrate. For systems such as the glycolytic pathway, this has proved not to be true. However, the corrin biosynthetic transmethylases require re-evaluation of this concept. Here six members of a single pathway have indeed evolved from a single ancestor and, while retaining the same overall fold, have incorporated small changes allowing recognition of different substrates at a number of points along the pathway.

Fig. 6 CbiF topology and alignments to the DNA transmethylase HhaI. a, A topology diagram of CbiF is shown with the numbering of secondary structural elements and conserved residues highlighted. The CbiF N-terminal domain is aligned with the DNA transmethylase HhaI in accordance with the output of DALI (shown as colored segments). Additional secondary structural elements which are not used in the alignments but lie in similar three-dimensional space are shown with hashed lines. The ligand binding sites sit at opposite ends of the aligned domains. b, A schematic diagram of the topological superposition between CbiF and HhaI, showing the two possible arrangements. The shaded ellipsoid represents the AdoMet/Hcy ligand, triangles represent strands and circles represent helices.


Methods

Purification and crystallization. CbiF was purified as described¹⁴ by affinity chromatography using a Pharmacia nickel chelating sepharose and an N-terminal His-tag. The pure protein was dialyzed against 20 mM sodium acetate, pH 5.6, 100 mM NaCl and concentrated to 16–20 mg ml⁻¹ for crystallization. The protein was incubated with 5 mM AdoMet prior to crystallization and then mixed in equal volumes with 1.0–1.2 M Na/KPO₄, 0.1 M HEPES, pH 7.5, and 4% dioxane and equilibrated over the same solution. Crystals do not form in the absence of ligand, though they do form in the presence of AdoHcy. There is one molecule in the asymmetric unit giving a VM of 3.4 and 64% solvent. The crystals were shock frozen at 120 K for X-ray data collection in a cryosolvent containing the crystallization precipitants plus 30% glycerol.
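The quoted solvent content follows from VM through the approximation usually attributed to Matthews, in which the protein fraction of the crystal volume is taken as roughly 1.23/VM, with VM in Å³ per dalton. The constant 1.23 and the function name below are assumptions introduced for this sketch; neither is stated in the text.

```python
def solvent_fraction(vm, protein_volume_constant=1.23):
    """Estimate the solvent fraction of a crystal from the Matthews coefficient VM.

    vm is in Å^3 per dalton; 1.23 is the commonly used approximation for the
    volume occupied by protein per dalton (an assumption, not from the paper).
    """
    if vm <= 0:
        raise ValueError("VM must be positive")
    return 1.0 - protein_volume_constant / vm

vm_cbif = 3.4  # value reported for the phosphate crystal form
print(f"estimated solvent content: {100 * solvent_fraction(vm_cbif):.0f}%")
```

With VM = 3.4 the estimate rounds to the 64% reported above.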

Both native and heavy atom derivative data were collected at the Daresbury Synchrotron Radiation Source beamline 9.6 at a wavelength of 0.91 Å. Data were indexed and scaled with DENZO and SCALEPACK (Table 1)²⁹. The heavy atom derivative was obtained by soaking native CbiF crystals in precipitant containing 1 mM methyl-mercury chloride (MeHgCl) for one hour prior to data collection.

Structure determination. The structure was solved by single isomorphous replacement with anomalous scattering (SIRAS) at 2.4 Å. The mercury derivative was identified (Riso = 0.179) using the CCP4 suite of programs and refined in MLPHARE30 (Table 1). The excellent resultant phases were further enhanced by solvent flattening31 using a solvent content of 70%, a value above the calculated solvent percentage. The high solvent content did not cause a disjointed protein surface mask and was justified by the large increase in the figure of merit from 0.402 (SIRAS) to 0.799 (flattened) (Table 1). The density modified maps enabled building 67% of the residues with the program ‘O’32. The full model was completed after one round of refinement using X-PLOR (8–2.4 Å)33. Refinement was completed with REFMAC (resolution 20–2.4 Å) using a bulk solvent correction34. The R-factor is 20.4% (Rfree 25.2% for 5% of the data). The model contains 135 waters, two phosphate ions and an AdoHcy ligand. There are 278 residues in the recombinant protein; the first twenty contain six histidines and a thrombin cleavage site (M1GSSHHHHHH SSGLVPRGSH M21). The natural protein starts at residue Met 21. Residues 13–251 are visible in the density, but the N-terminal His-tag and the C-terminal 27 residues are disordered.

The side chain of one residue, Asp 123, has been assigned an alternate conformation with half site occupancy. The overall B-factor for main chain atoms is 35.8 Å2 but residues at the N-terminus (13–17), loop β2-αB (53–57), loop β3-αC (73–80) and the C-terminal residue (251) have B-factors of over 50 Å2. The high B-factors may be a result of the loose crystal packing, high solvent content and the lack of order of 39 terminal residues (14% of amino acids). The overall B-value estimated from the Wilson plot for the data is 40.8 Å2. Eighty-eight percent of residues are in the most favored regions of the Ramachandran plot35 with only one residue (Leu 56) in a generously allowed region as defined in PROCHECK36.
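The count of 39 disordered terminal residues follows from the residue ranges given in the text. The short sketch below only reproduces that arithmetic. The variable names are illustrative, and the tag sequence is copied from the construct description (M1 to H20).

```python
# Residue numbering as stated in the text.
TOTAL_RESIDUES = 278               # recombinant protein, including the tag
FIRST_ORDERED, LAST_ORDERED = 13, 251

# N-terminal tag region of the construct, residues 1-20 (Met 21 follows).
HIS_TAG = "MGSSHHHHHH" + "SSGLVPRGSH"

n_term_missing = FIRST_ORDERED - 1               # residues 1-12
c_term_missing = TOTAL_RESIDUES - LAST_ORDERED   # residues 252-278
disordered = n_term_missing + c_term_missing
fraction = disordered / TOTAL_RESIDUES

print(f"tag length           = {len(HIS_TAG)}")
print(f"N-terminal missing   = {n_term_missing}")
print(f"C-terminal missing   = {c_term_missing}")
print(f"disordered residues  = {disordered} ({fraction:.0%} of the chain)")
```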

Thrombin cleavage leads to a different crystal form. The N-terminal His-tag was cleaved off CbiF using thrombin, after overnight incubation at 30 °C in 70 mM Tris, pH 8.5, 100 mM NaCl and 2.5 mM CaCl2. The cleaved CbiF was purified by gel filtration chromatography and concentrated in the storage buffer. The cleaved protein did not crystallize in the high phosphate medium used for the full length His-tagged molecule. However, crystals did grow from 25% monomethylether polyethylene glycol (2,000 Mr), 200 mM MgCl2 and 100 mM Tris buffer, pH 8.5. Data were collected to 3.1 Å on a Rigaku Raxis imaging plate using a conventional copper rotating anode as X-ray source (λ = 1.54 Å) (Table 1).

A single unambiguous molecular replacement solution was determined for the new crystal form using AMoRe37. This phosphate-free structure was refined to an R-factor of 19.3% and an Rfree of 28.3% at 3.1 Å. Residues 18–251 are visible in the density; no waters have been modeled. Several surface side chains and the loop containing residues 161–175 have poor density.

Coordinates. The coordinates for both structures have been deposited in the Brookhaven Protein Data Bank with the codes 1CBF (His-tagged form at 2.4 Å) and 2CBF (second crystal form at 3.1 Å with His-tag cleaved).

Acknowledgments

We gratefully acknowledge funding from the National Institutes of Health, the Wellcome Trust and the Biotechnology and Biological Sciences Research Council. We thank the Central Laboratory of the Research Council and the staff of the Daresbury Laboratory for the provision of synchrotron radiation facilities and the BBSRC for support of such usage through the Rolling Project Mode Time allocation to York.

Received by 23 February, 1998; accepted 27 May, 1998.

Table 1 Data statistics

Data set                                  Native           MeHgCl           Cleaved
Space group                               P3₁21            P3₁21            P3₂21
a = b (Å)                                 80.70            80.58            80.04
c (Å)                                     109.58           109.54           77.96
Resolution range (Å)                      20–2.4           20–2.4           20–3.1
Unique reflections                        16,620           16,375           5,523
Completeness (%), overall (final shell)   99.8 (100.0)     97.9 (91.0)      95.3 (93.7)
Rmerge¹ (final shell)                     0.039 (0.122)    0.051 (0.284)    0.103 (0.489)
Phasing power² (centric / acentric)       –                1.05 / 1.35      –
FOM³: SIRAS / solvent flattened           –                0.402 / 0.799    –
R-factor⁴ (Rfree)                         0.204 (0.252)    –                0.193 (0.283)
Resolution range (Å)                      20–2.4           –                20–3.1
RMS deviation from ideality
  bonds (Å)                               0.014            –                0.011
  angles (°)                              2.3              –                2.3
Average B-factors (Å2)
  main chain                              35.8             –                34.1
  side chain                              40.6             –                39.7
  solvent                                 49.0             –                –
  phosphate                               45.3             –                –
  SAH                                     22.7             –                22.2

¹ $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I - \langle I \rangle| \,/\, \sum_{hkl}\sum_{i} I$
² Phasing power = $F_{H}$ / lack of closure
³ FOM = figure of merit: $(\langle\cos\phi\rangle^{2} + \langle\sin\phi\rangle^{2})^{1/2}$
⁴ $R\text{-factor} = \sum_{hkl} \big|\,|F_{\mathrm{obs}}| - k|F_{\mathrm{calc}}|\,\big| \,/\, \sum_{hkl} |F_{\mathrm{obs}}|$
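The footnote definitions of Rmerge and the R-factor translate directly into code. The sketch below is a minimal illustration of those two formulas, not the implementation used by SCALEPACK or REFMAC. The function names and the toy intensities and amplitudes are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def r_merge(observations):
    """Rmerge = sum_hkl sum_i |I - <I>| / sum_hkl sum_i I.

    `observations` maps each unique reflection (h, k, l) to the list of its
    repeated intensity measurements.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

def r_factor(f_obs, f_calc, k=1.0):
    """R = sum_hkl ||Fobs| - k|Fcalc|| / sum_hkl |Fobs|, with scale factor k."""
    numerator = sum(abs(abs(fo) - k * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Hypothetical toy data, for illustration only.
toy_observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
toy_f_obs = [100.0, 50.0, 25.0]
toy_f_calc = [90.0, 55.0, 25.0]

print(f"Rmerge   = {r_merge(toy_observations):.4f}")
print(f"R-factor = {r_factor(toy_f_obs, toy_f_calc):.4f}")
```

Rfree is computed with the same R-factor expression, restricted to the test set of reflections excluded from refinement (5% of the data for the native structure).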


1. Stubbe, J. Binding site revealed for Nature’s most beautiful cofactor. Science 266, 1663 (1994).

2. Scott, I. How nature synthesizes vitamin B12 - a survey of the last four billion years. Angew. Chem. 32, 1223–1376 (1993).

3. Battersby, A.R. How nature builds the pigments of life - The conquest of vitamin B12. Science 264, 1551–1557 (1994).

4. Blanche, F., et al. Vitamin B12 - How the problem of biosynthesis was solved. Angew. Chem. Int. Ed. Engl. 32, 1651–1653 (1995).

5. Debussche, L., Thibaut, D., Cameron, B., Crouzet, J. & Blanche, F. Biosynthesis of the corrin macrocycle of coenzyme-B12 in Pseudomonas denitrificans. J. Bacteriol. 175, 7430–7440 (1993).

6. Lawrence, J.G. & Roth, J.R. The cobalamin (coenzyme B12) biosynthetic genes of Escherichia coli. J. Bacteriol. 177, 6371–6380 (1995).

7. Roth, J.R., Lawrence, J.G., Rubenfield, M., Kieffer-Higgins, S. & Church, G.M. Characterization of the cobalamin (vitamin B12) biosynthetic genes of Salmonella typhimurium. J. Bacteriol. 175, 3303–3316 (1993).

8. Jordan, P.M. Highlights in haem biosynthesis. Curr. Opin. Struc. Biol. 4, 902–911 (1994).

9. Blanche, F., et al. Parallels and decisive differences in vitamin B12 biosyntheses. Angew. Chem. Int. Ed. Engl. 32, 1651–1653 (1993).

10. Raux, E., et al. Salmonella typhimurium cobalamin (vitamin B12) biosynthetic genes: Functional studies in S. typhimurium and Escherichia coli. J. Bacteriol. 178, 753–767 (1996).

11. Raux, E., Thermes, C., Heathcote, P., Rambach, A. & Warren, M.J. A role for the Salmonella typhimurium cbiK in cobalamin (vitamin B12) and siroheme biosynthesis. J. Bacteriol. 179, 3203–3212 (1997).

12. Scott, I.A., et al. Biosynthesis of vitamin B12: Factor IV, a new intermediate in the anaerobic pathway. Proc. Natl. Acad. Sci. USA 93, 14316–14319 (1996).

13. Martin-Verstraete, I., Debarbouille, M., Klier, A. & Rapoport, G. Levanase operon of Bacillus subtilis includes a fructose-specific phosphotransferase system regulating the expression of the operon. J. Mol. Biol. 214, 657–669 (1990).

14. Raux, E., Woodcock, S.C., Schubert, H.L., Wilson, K.S. & Warren, M.J. Cobalamin (vitamin B12) biosynthesis; Cloning, expression and crystallisation of the Bacillus megaterium S-adenosyl-L-methionine dependent cobalt-precorrin-4 transmethylase CbiF. Euro. J. Bacteriol. in the press (1998).

15. Oldfield, T.J. Real space refinement as a tool for model building. CCP4 Study Weekend: Macromolecular refinement (Dodson, E.J., Moore, M.H., Ralph, A. & Bailey, S., eds.) 67–74 (SERC Daresbury Laboratory, Warrington, UK; 1996).

16. Malone, T., Blumenthal, R.M. & Cheng, X. Structure-guided analysis reveals nine sequence motifs conserved among DNA amino-methyl-transferases, and suggests a catalytic mechanism for these enzymes. J. Mol. Biol. 253, 618–632 (1995).

17. Hodel, A.E., Gershon, P.D., Shi, X. & Quiocho, F.A. The 1.85 Å structure of Vaccinia protein VP39: A bifunctional enzyme that participates in the modification of both mRNA ends. Cell 85, 247–256 (1996).

18. Blanche, F., Debussche, L., Thibaut, D., Crouzet, J. & Cameron, B. Purification and characterization of S-adenosyl-L-methionine: uroporphyrinogen III methyltransferase from Pseudomonas denitrificans. J. Bacteriol. 171, 4222–4231 (1989).

19. Drennan, C.L., Huang, S., Drummond, J.T., Matthews, R.G. & Ludwig, M.L. How a protein binds B12: A 3.0 Å X-ray structure of B12-binding domains of methionine synthase. Science 266, 1669–1674 (1994).

20. Roessner, C.A., et al. Expression of 9 Salmonella typhimurium enzymes for cobalamide synthesis. FEBS Letters 301, 73–78 (1992).


21. Woodcock, S.C. & Warren, M.J. Evidence for a covalent intermediate in the S-adenosyl-L-methionine-dependent transmethylation reaction caused by sirohaem synthase. Biochem. J. 313, 415–421 (1996).

22. Woodcock, S.C., et al. The contribution of the CysGA and CysGB domains of sirohaem synthase (CysG) towards cobalamin (vitamin B12) biosynthesis. Biochem. J. 330, 121–129 (1998).

23. Schluckebier, G., O’Gara, M., Saenger, W. & Cheng, X. Universal catalytic domain structure of AdoMet-dependent methyltransferases. J. Mol. Biol. 247, 16–20 (1995).

24. Holm, L. & Sander, C. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138 (1993).

25. Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.J., Brice, M.D., Rogers, J.K., Kennard, O., Shimanouchi, T. & Tasumi, M. The protein data bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542 (1977).

26. Freymann, D.M., Keenan, R.J., Stroud, R.M. & Walter, P. The structure of the conserved GTPase domain of the signal recognition particle. Nature 385, 361–365 (1997).

27. Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B. & Thornton, J.M. CATH - a hierarchic classification of protein domain structures. Structure 5, 1093–1108 (1997).

28. Dixon, M.M., Huang, S., Matthews, R.G. & Ludwig, M. The structure of the C-terminal domain of methionine synthase: presenting S-adenosylmethionine for reductive methylation of B12. Structure 4, 1263–1275 (1996).

29. Otwinowski, Z. Processing of X-ray diffraction data collected in oscillation mode. Meth. Enz. 276, 307–326 (1991).

30. Otwinowski, Z. Maximum likelihood refinement of heavy atom parameters. Proceedings of the CCP4 Study Weekend (Wolf, W., Evans, P.R. & Leslie, A.G.W., eds) 80–88 (SERC Daresbury Laboratory, Warrington, UK; 1991).

31. Cowtan, K. in CCP4 & ESF-EACBM Newsletter on Protein Crystallography 34–38 (1994).

32. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building protein models in electron density maps and location of errors in these models. Acta Crystallogr. A 47, 110–119 (1991).

33. Brunger, A.T. X-PLOR Version 3.1: A system for X-ray Crystallography and NMR (Yale University Press, New Haven, Connecticut, USA; 1992).

34. Murshudov, G.N., Vagin, A.A. & Dodson, E.J. Refinement of macromolecular structures by the maximum likelihood method. Acta Crystallogr. D 53, 240–255 (1997).

35. Ramachandran, G.N. & Sasisekharan, V. Conformations of polypeptides and proteins. Adv. Prot. Chem. 23, 283–437 (1968).

36. Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M. PROCHECK - a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291 (1993).

37. Navaza, J. AMORE - an automated package for molecular replacement. Acta Crystallogr. A 50, 157–163 (1994).

38. Esnouf, R.M. An extensively modified version of MolScript that includes greatly enhanced coloring capabilities. J. Mol. Graph. 15, 133–138 (1997).

39. Kraulis, P.J. MOLSCRIPT - a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946–950 (1991).

40. Nicholls, A., Sharp, K.A. & Honig, B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281–296 (1991).