5
Proc. Natl. Acad. Sci. USA Vol. 92, pp. 7714-7718, August 1995 Biochemistry Eukaryotic methionyl aminopeptidases: Two classes of cobalt-dependent enzymes (cotranslational processing/initiation of protein synthesis/evolution) STUART M. ARFIN*, RICHARD L. KENDALL*t, LINDA HALL*, LARRY H. WEAVERt, ALBERT E. STEWART*II BRLAN W. MATrHEWSt, AND RALPH A. BRADSHAW* *Department of Biological Chemistry, College of Medicine, University of California, Irvine, CA 92717-1700; and *Institute of Molecular Biology, Howard Hughes Medical Institute and Department of Physics, University of Oregon, Eugene, OR 97403 Contributed by Brian W Matthews, May 24, 1995 ABSTRACT Using partial amino acid sequence data de- rived from porcine methionyl aminopeptidase (MetAP; me- thionine aminopeptidase, peptidase M; EC 3.4.11.18), a full- length clone of the homologous human enzyme has been obtained. The cDNA sequence contains 2569 nt with a single open reading frame corresponding to a protein of 478 amino acids. The C-terminal portion representing the catalytic do- main shows limited identity with MetAP sequences from various prokaryotes and yeast, while the N terminus is rich in charged amino acids, including extended strings of basic and acidic residues. These highly polar stretches likely result in the spuriously high observed molecular mass (67 kDa). This cDNA sequence is highly similar to a rat protein, termed p67, which was identified as an inhibitor of phosphorylation of initiation factor eIF2a and was previously predicted to be a metallopeptidase based on limited sequence homology. Model building established that human MetAP (p67) could be readily accommodated into the Escherichia coli MetAP struc- ture and that the Co2+ ligands were fully preserved. However, human MetAP was found to be much more similar to a yeast open reading frame that differed markedly from the previ- ously reported yeast MetAP. A similar partial sequence from Methanothermusfervidus suggests that this p67-like sequence is also found in prokaryotes. These findings suggest that there are two cobalt-dependent MetAP families, presently composed of the prokaryote and yeast sequences (and represented by the E. coli structure) (type I), on the one hand, and by human MetAP, the yeast open reading frame, and the partial pro- karyotic sequence (type II), on the other. Protein synthesis is generally initiated with methionine (in eukaryotes) or N-formylmethionine (in prokaryotes). How- ever, the products of this process are subject to one or more N-terminal modification reactions, occurring co- or posttrans- lationally, that often result in mature proteins with a variety of free (or N"-acylated) N-terminal amino acid residues other than methionine (1-3). Although the physiological importance of these modifications is incompletely understood, they likely operate in functional regulation, intracellular targeting, and protein turnover. Removal of the initiator methionine of nascent polypeptide chains (when it occurs) is catalyzed by a methionyl aminopep- tidase (MetAP; methionine aminopeptidase, peptidase M; EC 3.4.11.18). In bacteria, the N-formyl group of the initiator N-formylmethionine is first removed by an appropriate de- formylase. Analyses of the protein data base (2, 4-6) and studies of the processing of mutant proteins (6-8) or proteins with systematically modified N-terminal sequences (9, 10) suggest that MetAPs from diverse sources have quite similar substrate specificities. In addition, purified enzymes from bacteria (11, 12), yeast (13), and porcine liver (14) display similar substrate specificities that are largely consistent with that predicted from these analyses. In all cases, the penultimate amino acid is apparently the primary determinant of specific- ity; cleavage occurs when the penultimate residue is small (G,A,S,C,T,P,V) and not when it is large (D,E,N,Q,K,H,R, L,I,M,F,Y,W). The genes encoding the Escherichia coli (11), Salmonella typhimurium (12), Bacillus subtilis (15), and Saccharomyces cerevisiae (16) MetAPs have been cloned and sequenced. The prokaryotic enzymes are similar in size ("30 kDa), whereas the yeast enzyme contains a 125-amino acid N-terminal ex- tension with sequences resembling zinc-finger domains. The remainder of the sequence shares "40% amino acid identity with the bacterial enzymes. All MetAPs studied to date are activated by Co2+ and appear to require this ion for activity. The x-ray structure of E. coli MetAP has been determined to 2.4-A resolution (17). The enzyme contains a distinctive fold that displays internal two-fold symmetry of the first and second halves of the sequence and appears to represent another class of proteolytic enzyme. Protein data base searches using both sequence- and structure-based profiles identified several other proteins likely to share the same fold (18). These proteins included creatinase (19), aminopeptidase P (20, 21), proline dipeptidase (22, 23), and p67 (24, 25), a eukaryotic initiation factor 2 (eIF2)-associated protein proposed to regulate protein synthesis by protecting the eIF2 a subunit from phosphoryla- tion by eIF2 kinases. With a panel of substrates that reflect the precise specificity associated with cotranslational processing, a porcine liver MetAP was isolated and characterized (14). The enzyme has an apparent Mr of 67,000, is monomeric, and requires Co2+ ions for activity. Using partial sequence data obtained from endoprotease Lys-C fragments, we report here the sequence of a full-length clone of the homologous human protein. The deduced amino acid sequence of the human protein is highly similar to the rat p67 protein (eIF2-associated protein). Model building suggests that substantial segments of the C-terminal domain are fully compatible with the three-dimensional struc- ture of E. coli MetAP, confirming the prediction (18) that p67 may function as a metal-dependent aminopeptidase. The subsequent identification of a yeast open reading frame (ORF) coding for a putative protein 55% identical to human MetAP (p67) (although lacking most of the N-terminal extension) (26) Abbreviations: MetAP, methionine aminopeptidase; RACE, rapid amplification of 5' cDNA ends; ORF, open reading frame. tPresent address: Department of Biochemistry, Merck Research Lab- oratories, Rahway, NJ 07065-0900. §Present address: Imperial Cancer Research Fund, Structural Molec- ular Biology Unit, Department of Crystallography, Birkbeck College, Malet Street, London WCIE 7HX, United Kingdom. ITo whom reprint requests should be addressed. liThe sequence reported in this paper has been deposited in the GenBank data base (accession no. U29607). 7714 The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. Downloaded by guest on December 12, 2021

Eukaryotic methionyl aminopeptidases: Twoclasses of cobalt

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Proc. Natl. Acad. Sci. USAVol. 92, pp. 7714-7718, August 1995Biochemistry

Eukaryotic methionyl aminopeptidases: Two classes ofcobalt-dependent enzymes

(cotranslational processing/initiation of protein synthesis/evolution)

STUART M. ARFIN*, RICHARD L. KENDALL*t, LINDA HALL*, LARRY H. WEAVERt, ALBERT E. STEWART*IIBRLAN W. MATrHEWSt, AND RALPH A. BRADSHAW**Department of Biological Chemistry, College of Medicine, University of California, Irvine, CA 92717-1700; and *Institute of Molecular Biology,Howard Hughes Medical Institute and Department of Physics, University of Oregon, Eugene, OR 97403

Contributed by Brian W Matthews, May 24, 1995

ABSTRACT Using partial amino acid sequence data de-rived from porcine methionyl aminopeptidase (MetAP; me-thionine aminopeptidase, peptidase M; EC 3.4.11.18), a full-length clone of the homologous human enzyme has beenobtained. The cDNA sequence contains 2569 nt with a singleopen reading frame corresponding to a protein of 478 aminoacids. The C-terminal portion representing the catalytic do-main shows limited identity with MetAP sequences fromvarious prokaryotes and yeast, while the N terminus is rich incharged amino acids, including extended strings of basic andacidic residues. These highly polar stretches likely result in thespuriously high observed molecular mass (67 kDa). ThiscDNA sequence is highly similar to a rat protein, termed p67,which was identified as an inhibitor of phosphorylation ofinitiation factor eIF2a and was previously predicted to be ametallopeptidase based on limited sequence homology. Modelbuilding established that human MetAP (p67) could bereadily accommodated into the Escherichia coli MetAP struc-ture and that the Co2+ ligands were fully preserved. However,human MetAP was found to be much more similar to a yeastopen reading frame that differed markedly from the previ-ously reported yeast MetAP. A similar partial sequence fromMethanothermusfervidus suggests that this p67-like sequenceis also found in prokaryotes. These findings suggest that thereare two cobalt-dependent MetAP families, presently composedofthe prokaryote and yeast sequences (and represented by theE. coli structure) (type I), on the one hand, and by humanMetAP, the yeast open reading frame, and the partial pro-karyotic sequence (type II), on the other.

Protein synthesis is generally initiated with methionine (ineukaryotes) or N-formylmethionine (in prokaryotes). How-ever, the products of this process are subject to one or moreN-terminal modification reactions, occurring co- or posttrans-lationally, that often result in mature proteins with a variety offree (or N"-acylated) N-terminal amino acid residues otherthan methionine (1-3). Although the physiological importanceof these modifications is incompletely understood, they likelyoperate in functional regulation, intracellular targeting, andprotein turnover.Removal of the initiator methionine of nascent polypeptide

chains (when it occurs) is catalyzed by a methionyl aminopep-tidase (MetAP; methionine aminopeptidase, peptidase M; EC3.4.11.18). In bacteria, the N-formyl group of the initiatorN-formylmethionine is first removed by an appropriate de-formylase. Analyses of the protein data base (2, 4-6) andstudies of the processing of mutant proteins (6-8) or proteinswith systematically modified N-terminal sequences (9, 10)suggest that MetAPs from diverse sources have quite similarsubstrate specificities. In addition, purified enzymes from

bacteria (11, 12), yeast (13), and porcine liver (14) displaysimilar substrate specificities that are largely consistent withthat predicted from these analyses. In all cases, the penultimateamino acid is apparently the primary determinant of specific-ity; cleavage occurs when the penultimate residue is small(G,A,S,C,T,P,V) and not when it is large (D,E,N,Q,K,H,R,L,I,M,F,Y,W).The genes encoding the Escherichia coli (11), Salmonella

typhimurium (12), Bacillus subtilis (15), and Saccharomycescerevisiae (16) MetAPs have been cloned and sequenced. Theprokaryotic enzymes are similar in size ("30 kDa), whereasthe yeast enzyme contains a 125-amino acid N-terminal ex-tension with sequences resembling zinc-finger domains. Theremainder of the sequence shares "40% amino acid identitywith the bacterial enzymes. All MetAPs studied to date areactivated by Co2+ and appear to require this ion for activity.The x-ray structure of E. coli MetAP has been determined to2.4-A resolution (17). The enzyme contains a distinctive foldthat displays internal two-fold symmetry of the first and secondhalves of the sequence and appears to represent another classof proteolytic enzyme. Protein data base searches using bothsequence- and structure-based profiles identified several otherproteins likely to share the same fold (18). These proteinsincluded creatinase (19), aminopeptidase P (20, 21), prolinedipeptidase (22, 23), and p67 (24, 25), a eukaryotic initiationfactor 2 (eIF2)-associated protein proposed to regulate proteinsynthesis by protecting the eIF2 a subunit from phosphoryla-tion by eIF2 kinases.With a panel of substrates that reflect the precise specificity

associated with cotranslational processing, a porcine liverMetAP was isolated and characterized (14). The enzyme hasan apparent Mr of 67,000, is monomeric, and requires Co2+ions for activity. Using partial sequence data obtained fromendoprotease Lys-C fragments, we report here the sequence ofa full-length clone of the homologous human protein. Thededuced amino acid sequence of the human protein is highlysimilar to the rat p67 protein (eIF2-associated protein). Modelbuilding suggests that substantial segments of the C-terminaldomain are fully compatible with the three-dimensional struc-ture of E. coli MetAP, confirming the prediction (18) that p67may function as a metal-dependent aminopeptidase. Thesubsequent identification of a yeast open reading frame (ORF)coding for a putative protein 55% identical to human MetAP(p67) (although lacking most of the N-terminal extension) (26)

Abbreviations: MetAP, methionine aminopeptidase; RACE, rapidamplification of 5' cDNA ends; ORF, open reading frame.tPresent address: Department of Biochemistry, Merck Research Lab-oratories, Rahway, NJ 07065-0900.§Present address: Imperial Cancer Research Fund, Structural Molec-ular Biology Unit, Department of Crystallography, Birkbeck College,Malet Street, London WCIE 7HX, United Kingdom.ITo whom reprint requests should be addressed.liThe sequence reported in this paper has been deposited in theGenBank data base (accession no. U29607).

7714

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertisement" inaccordance with 18 U.S.C. §1734 solely to indicate this fact.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 12

, 202

1

Proc. Natl. Acad. Sci. USA 92 (1995) 7715

supports the view that this protein (and the yeast homolog)represent a second class of MetAPs that are distantly relatedto the previously described prokaryote/yeast enzymes andshare specificity, metal requirements, and general backbonefold (in the catalytic domain).

MATERIALS AND METHODSDetermination of Partial Amino Acid Sequences. For pep-

tide analyses, 200 jig of pure porcine MetAP was incubatedwith endoprotease Lys-C (Boehringer Mannheim) for 6 hr at37°C. The resulting peptides were separated by HPLC on anAquapore C8 column. Peptides were eluted using a gradientfrom 10 to 70% (vol/vol) acetonitrile in 0.1% trifluoroaceticacid (Applied Biosystems model 140A). The column effluentwas monitored at 212 nm with an ISCO model CV4 capillaryelectrophoresis detector. Selected peaks were analyzed byautomated liquid pulse sequencing on an Applied Biosystemsmodel 477A sequencer equipped with an online phenylthio-hydantoin analyzer. The N-terminal sequences offour peptideswere determined: peptide 1, TVPIV; peptide 2, GGEATRM-EEGEVYAIETFG; peptide 3, NLXDLGIVDPYPPL; andpeptide 4, KALDQASEEIWNDFREAAEAXXQV (see alsoFig. 1).cDNA Cloning ofPorcine and Human MetAP. Pairs of sense

and antisense degenerate oligonucleotides corresponding toregions of peptides 2 and 4 were synthesized by using phos-phoramidite chemistry on an Applied Biosystems model 391Ainstrument and used in all combinations for PCR amplificationof oligo(dT)-primed first-strand cDNA synthesized from tem-plate porcine liver RNA. The combination of the sense oligo-nucleotide from peptide 2, 5'-ATG GAR GAR GGN GARGTN TT-3' and the antisense oligonucleotide 5'-GG RTANGG RTCNCADAT NCC-3' (whereD is T, G, orA) from pep-tide 4 yielded a 273-bp product that was purified and ligatedinto the pCRII vector (TA cloning kit, Invitrogen). Thetranslated nucleotide sequence contained the C-terminal se-quence of peptide 2 and the N-terminal sequence of peptide4. The PCR fragment was 32P-labeled by the random-primermethod and used to probe a porcine liver AgtlO cDNA library(Clontech). Forty positive clones were obtained from screen-ing 1.5 x 106 recombinant phage. A 2.1-kb clone, p71A1, wassubsequently used to screen a cDNA library prepared fromhuman fetal brain poly(A)-selected RNA in the AZAP IIvector (Stratagene). Sequencing of the human cDNA clone(pMB) with the largest insert (-2.4 kb) revealed that this clonedid not contain an in-frame stop codon upstream of the firstAUG. An internal primer and a library-specific primer werethen used to amplify the 5' ends of all positive clones. None ofthese clones proved to be longer at their 5' ends than clonepMB. Therefore, rapid amplification of 5' cDNA ends(RACE) (27) was done with first-strand cDNA prepared byreverse transcription of human HepG cell RNA as template todetermine the 5' sequences. The product was cloned intopBluescript II. Five independent clones were isolated andsequenced.

MODEL BUILDINGUsing the known crystal structure of E. coli MetAP (17) as areference, the model-building program FRODO (28) was usedto determine whether the sequence of the human enzyme,aligned as in Fig. 2, was compatible with a similar three-dimensional fold. The backbone of the protein was keptconstant, and the side chains corresponding to the humanMetAP sequence were built to have standard geometry and, sofar as possible, to avoid steric interference with neighboringparts of the structure. Particular attention was given to sub-stitutions involving glycine and proline because these residues

can be associated with specific conformations of the polypep-tide backbone.

RESULTSHuman MetAP cDNA and Predicted Protein Sequence.

Using the highly selective, nonprocessive activity attributed tothe N-terminal processing of the initiator methionine residueon nascent chains during eukaryotic protein synthesis, a 67-kDa protein was isolated and characterized from porcine liver(14). The enzyme is monomeric and is strongly stimulated bycobalt ions, similar to other MetAPs that have been identified(11-13). Treatment of 200 jig of porcine MetAP with endo-protease Lys-C, followed by fractionation of the digest byHPLC and automated peptide sequencing, yielded four pep-tide sequences (see Materials and Methods and Fig. 1). Thesedata were used to generate oligonucleotides for PCR, yieldinga fragment that was suitable for the identification of a partialporcine clone. This sequence was used to identify a nearlycomplete human cDNA sequence (pMB).The 5' sequence wasgenerated by the RACE procedure (27).

Fig. 1 shows the nucleotide sequence of the putative full-length cDNA of human MetAP, reconstituted from the RACEand pMB sequences, and the predicted amino acid sequence.The cDNA sequence contains a total of 2569 nt and a singleORF extending from nt 1 to nt 1439. There is no poly(A) tail,although several possible polyadenylylation signals can befound in the 3' untranslated region. The 5' end of the cDNAlies within a (GC)-rich region, suggesting that it is a GpC islandand probably the start of transcription. Although there is noin-frame stop codon preceding the first ATG, the nucleotidesequence surrounding it agrees well with the consensus se-quence proposed for translation initiation in higher eukaryotes(29).The ORF encodes a protein of 478 amino acids with a

calculated molecular mass of 52,832 Da. The N-terminalportion of MetAP is rich in charged amino acids, 53 of the first110 residues consisting of arginine, lysine, glutamic, andaspartic acids. These residues include runs of 9 (residues36-44) and 10 (residues 98-107) basic amino acids and a runof 11 (residues 77-87) charged amino acids, of which 9 areacidic and 2 are basic. Comparison of the deduced amino acidsequence to that of the MetAPs from E. coli (11), S. typhi-murium (12), B. subtilis (15), and S. cerevisiae (13) shows thatthe yeast and human enzymes have N-terminal extensions of125 and 160 amino acids, respectively, relative to the prokary-otic forms (Fig. 2). There is no obvious sequence similaritybetween the two extensions. In the portion of the proteinshared by human MetAP and the yeast and prokaryoticenzymes, there is 20-24% sequence identity. In contrast, theyeast enzyme is 40% identical to the bacterial enzymes in thissame region.The substantial sequence differences, as well as the overall

organizational differences distinguishing the human MetAPenzyme and the characterized MetAPs, raised the question ofwhether the human MetAP was truly homologous to thisgroup, particularly because the extent of sequence similaritywas marginal. Accordingly, the human MetAP sequence wasfitted to the E. coli structure to ascertain the extent to whichit could (or could not) be accommodated.

Significantly, none of the deletions or insertions occurwithin a-helices or (3-sheets. Rather, they are located nearresidues 58, 84, 117, 163, 169, 176, and 220 of the E. colisequence, all of which are within irregular or loop regions onthe surface of the E. coli MetAP molecule, where insertionsand deletions might be anticipated. Model building also sug-gests that the changes in sequence that occur within a-helicesand }3-sheets can be accommodated with only minor pertur-bations of the packing of these secondary-structure elements.This hypothesis is illustrated by the region surrounding resi-

Biochemistry: Arfin et aL

Dow

nloa

ded

by g

uest

on

Dec

embe

r 12

, 202

1

7716 Biochemistry: Arfin et al.

-22

99

Proc. Natl. Acad. Sci. USA 92 (1995)

M A G V E E V A A S G S H L N G D L D P D D R E E G A A S T A E E 33A S S F G R T S

_ _ S ~~~~~~~~~CTGAiGGACGAAGATAAAAATCAGGACCAGGTAAGTAGCAGCGCGAAAGA A K K K R R K K K K S K G P S A A G E Q E P D K E S G A S V D E V A R Q L E R 73

G A V S Q L T K

219 AT _G; ATGGCGATGGAGAT GGY3C CAAACS A L E D K E R D E D D E D G D G D G D G A T G K K K K K K K K K R G P K V Q T 113Q E K D A R

339 AGAT G ATGrGACT TGGAGGAD P P S V P I C D L Y P N G V F P K G Q E C E Y P P T Q D G R T A A W R T T S E 153

459 PC TTAGATCAGGCAAGrCAGAGATTTGGAATGATTTAGCCAGGGATCACAATE K K A L D Q A S E E I W N D F R E A A E A H R Q V R K Y V M S W I K P G M T M 193

579

699 T

I E I C E K L E D C S R K L I K E N G L N A G L A F P T G C S L N N C A A H Y T 233

PAATN A G D T T V L_ Y GCCGGKI F THACAGICAACATTAVTCFNPATYAP N A G D T T V L Q Y D D I C K I D F G T H I S G R I I D C A F T V T F N P K Y 2-73

939 _____ IGCATTGAAAGGE I D G K T Y Q V K P I R N L N G H S I G Q Y R I H A G K T V P I V K G G E A T 353

P

1059

1179

1299

A R G MI _ _CA7GAACTTGrGTr-CGAAAGTGQT rrAGATGAATAWTGGATTTCCArATTACATGAAAATmATTTTGAT&GACATGrGCCR M E E G E V Y A I E T F G S T G K G V V H D D M E C S H Y M K N F D V G H V P 393

AAA AGGCTTCMAGATAAC'G'AAGTCCCATCCIY; CCiACCCTGCCTTGGCC,APTGCGATCGCTTCCiGAAAGrAAAW GATGGCrCTGAAI R L P R T K H L L N V I N E N F G T L A F C R R W L D R L G E S K Y L M A L K 433

GAATCTGrGTATTG TCCATATCCACCATTATGTGTATACAGCGA CATACCATCCTGTTGCGTCCAATGrGMALGCAGN L C D L G I V D P Y P P L C D I K G S Y T A Q F E H T I L L R P T C K E V V S 473

C A Q P V K K L S

1419 TCGGCGAGTTTGTTGGAAAACATGATA(XAT;AATTAATTTGCCACATG rTGTarGTTTTTAAGTGGACR G D D Y * 478

A E E M T I K T

1539 CCATGTAATACTTTTT1659 C AATCTAAGAG_TGGKTATTT GAATGCTTT1779 _GECTGC CCT GGAATCATTTTCTAAAGATCT ACTATTC_C1899 A CATATGAAGrAAAGCr2019 A2139 TACCAGAATCCTA2259 A CTTCAATGM23792499cAAA.r A A

FIG. 1. Nucleotide sequence and deduced amino acid sequence of human MetAP (p67). Differences in the predicted rat liver p67 amino acidsequence are indicated below the human MetAP sequence. Sequences corresponding to the endoprotease Lys C-derived peptides are underlined.

dues 240-250 (E. coli MetAP numbering) (Fig. 3). In thisstereo projection, the model-built residues are shown withopen bonds, and the side chains of the E. coli MetAP modelare drawn in black circles. The backbone is retained exactly asin the E. coli structure. Clearly, the backbone conformation atAsp-242 is perfectly set up to accept the proline at this positionin the human sequence. Gly-244 of the E. coli sequence can bereplaced with the larger cysteine in the human moleculewithout introducing serious steric clashes. Cys-245 in the E. colienzyme is largely buried, but it appears that a lysine at thisposition, as in the human, could extend toward the surface, sothat the e-amino group might reach the solvent. Thr-241 is on

the surface, so that the arginine at this position can extend intothe solvent.

Additional confirmation that human MetAP (p67) and theE. coli enzyme adopt similar conformations is provided by theretention of the Co2+ ligands (defined in the E. coli structure)(17). As shown in Fig. 2, these motifs are fully conserved forthe human and E. coli MetAPs and, with the exception of asingle Gln -- Glu replacement, for the yeast MetAP as well.The requirement of the human enzyme for Co2+ for activityhas been established (14).Comparison to Other Sequences. A search of the GenBank

data base revealed that human MetAP shared 92% sequence

S.c.I 1 M S T T T T V T T S D Q A S H P t K I Y CSGLQCGRH.. 1 H A G V E E V A A S G S H L N G D L D P D D R E E G A A S T A E E A A K K K R R K K K K S K G P S A A G E Q E P D K E S

S.c.I 30 ET SQHMCPVCLkQGiVSIFC TSCYe NNYKAHK LHNA OGLEGAYD FPKFKYSGKVKH.s. 61 G A S V D E V A R Q L E R S A L K IE RD E D D E D G E G 0 G 0 G A T G K K K K K K K K K R G P K V Q T D P P S V P I

E.C. 1 M i S I K t P D I E KM V GkLAAE LS.c.I 90 A YPLTPRRYVPEdIPK DWAANGLPVSEQRNdRLNNiPIYKKdQIKKI K CMLG E LH.S. 121 C D L Y P N G V F P K G Q E C EY P P T Q D G R T A A W K T T S E E K K A L D Q A S E E I W N D F R E A A E A H R Q V RS.c.I 1 H H X Q d f N L Q R T T V E E S R Y L K R d L E r E H W N D V R K G A e I H R RV R RA I K DM.f. 1 MEK kK GKI A SK

E.c. 26 E M i E P Y :: TG 1 D R I C N Y I V N E Q H A V S A C L G Y Y P K S V C I i E V V C I D-S. c. I 150 D I A A AHr T -D IvHNEt I R G A Y - P S P L N Y Y N F P K S L C T v E V I C G VH. s. 181 K Y V HS W I P GX T HM Ik - I CEKLEDCSRKLIKEN - - - GLNAGLAFPTGCS L NNC A A H Y T P N AS.c.II 51 IVPGHK1MDIAD N T T R Y T GA E N L AM D- P K S . L .H fM.f. 16 K A iK A v G E K i 1 d -lA F i N E I E - mG KIP C,I '-:IT 5 PC

E.c. 84- A1 G VN V T V I KD F H G T S K HF I V GK - P T I KG e R C r I T Q e S L Y L A 1 r M V KS.c.I 206 K E G V V S L Y Y Q Y HA L N E Y Y V G E2 I S K E A L N T T E T S reCLKLA M C KH. s. 237 D DTYLQICXIDIF1THISGRIIIICA FTVT F N - - - - P K Y D I LLKAVKDAT.NTGIK X C A GS.C.I 100oe K v y . QN N. S s - - -- N A. YEM.f. 66 N D RK iL P G 1 V I V vO1):F GG T T LV E ----G Y E KN Y N D E L EKNKKMI E A

E.c. 142 PGi N R.e'.l::: A: K Fv AEGF-SVVrE- C G RG - F E E P Q L H Y DS.c.I 265 P.G.T.T FQe :d H E.K HA T ENKKC-SVVr- C b v E f - F C S P N i H Y AH.s. 293 I DCVLCD V.O E A I Q Vv M E IY E V E.I D 0 K T X Q V - - - - K P I R N L N G H S I G Q Y R I H A GK TV P I V KS.c.II 164 :' .: N ----........ .. .C C A P aSM. f. 123 A e s A E N A I N T }R dO v 'I G K i G E V I E N T I N K F G F S T _ t D R W V 1 S L s i N

E.c. 188 S R E T N V V 1 Tf P H V N A K E i R T M K d - -

S.c.I 311 K N R T P G Y.: . DMTPH T PHIN- TWK DMTWPdM.S. 349 G -FGEASRMI-.OT ..Y.YA I KT.F'G.S.TKV.VHS D.DI.ME C S H Y M K N F D V G HV P IRLPRT K H L L NV I NS.c220 N - dEL .rY:T A r S A e DHQ H P T D S A N K T DH. f. 183 - Q N S H k 1...

E.c. 220- G W T v K T K DR LS y vv T D NS.c.I 343-- D T S T T QD K L s .1 v T ESH.s. 408 EINFFT L A. F C KR WN L D R L o EIS K Y L: M.-AL .K N L C:D L G I OVD P Y P P L C D I K G S Y T A Q F EH T I LL R P TSoeII 279K QI~E RH 100 N P HANHE.c. 244 G C i 1 t L R K D T I P A I I S H D E *S.c.I 367 G V i 1 t A K N K K S P G G P RQ R I K *H. s. 468 CKEV..R0DDX:

33 ..:....--...-..:..::...:.-......:-..:k.-..:::..... - *S. .II 339 K X *..

FIG. 2. Primary sequence alignment of E. coli MetAP (E.c.), S. cerevisiae MetAP-1 (S.c. I), human MetAP (H.s.), S. cerevisiae hypotheticalprotein YBLO91c (S.c. II), and a partial ORF from Methanothermusfervidus (M.f.). Blank spaces indicate that the amino acid residues are identicalto those in the human sequence; dashes indicate deletions introduced to maximize alignments. The three dots at the end of the M. fervidus sequenceindicate that it is derived from a partial clone and that the sequence extends beyond this point for an unknown distance. Lowercase letters denoteconservative substitutions. Residues that are cobalt ligands in the structure of E. coli MetAP and are conserved in the other sequences are boxed.Regions of substantial homology between human MetAP and one or more of the other sequences are indicated by shading.

GATACIAAAT

Dow

nloa

ded

by g

uest

on

Dec

embe

r 12

, 202

1

Proc. Natl. Acad. Sci. USA 92 (1995) 7717

FIG. 3. Stereo drawing illustrating the compatibility of the amino acid sequence of human MetAP with the three-dimensional structure of E.coli MetAP (17). The flgure shows the results of model building in which residues Thr-461-Ser-473 of the human sequence (open bonds) weremodel-built onto the backbone of the E. coli structure (solid bonds). [This is the region in which the alignment of the human vs. E. coli MetAPsequence has been revised relative to that previously proposed for the rat P67 sequence (18) (see text).] Numbering corresponds to the E. colisequence. D242P, for example, indicates that Asp-242 in the E. coli sequence is replaced by a proline in the human enzyme. 0, Carbon atoms; 0,oxygen atoms. To show the overall context within the three-dimensional E. coli MetAP structure, neighboring residues are shown in solid blackand are unlabeled.

identity at the amino acid level with that previously reportedfor a rat liver eIF2-associated protein designated p67 (Fig. 1).p67 copurifies with eIF2 and is postulated to regulate proteinsynthesis by protecting the eIF2 a subunit against phosphor-ylation. The amino acid differences occur almost exclusively inthe N-terminal region and at the C-terminal end. The observedN-terminal differences are probably due to species variations(rat vs. human). However, at the C terminus the p67 sequencecan be transformed into the MetAP sequence by a + 1 frame-shifting over 18 codons. The insertion of an extra nucleotide atposition 1412 in the MetAP sequence is compensated for by asingle-base deletion 50 nt further downstream. Interestingly,this revised sequence markedly improved alignment withprokaryotic MetAPs in this region (Fig. 2). Identity of thehuman MetAP and rat liver p67 proteins (excepting speciesvariations) was, in fact, presaged by the recent suggestion thatp67 belongs to a small family of structurally related proteinsthat includes the MetAPs of E. coli, S. typhimurium, B. subtilis,and S. cerevisiae and the prediction that it would function as ametal-dependent aminopeptidase (18).

DISCUSSIONDetermination of the human MetAP sequence based onpeptides derived from the homogeneous enzyme (14) estab-lished the predicted link between eIF2-associated protein(p67) and MetAP (18). The identity of these proteins is furtherestablished by their unusual behavior on SDS gels. Bothproteins show apparentMr values of 67,000 but have calculatedmolecular weights in the low 50,000 range. Datta et al. (30)attributed this difference to intracellular 0-linked glycosyla-tion (31), but because these modifications introduce only asingle monosaccharide per site, a rather substantial number ofsites would be required to account for the difference. A morelikely explanation is the observation that elongated tracts ofacidic and basic residues can produce anomalously high Mrvalues on SDS gels (32, 33). Yeast Fpr3 (also called yFKBP-47)is a protein of Mr 46,800 (calculated) but shows an apparentmass of 65 kDa on SDS gels; this yeast protein is characterizedby similar stretches of basic and acidic residues (32).Although it is not unusual for a protein to display more than

one activity, the association of MetAP with the initiationcomplex rather than the ribosome was unexpected. Cotrans-lational excision of the initiator methionine occurs when thenascent chain is still quite short (20-40 residues) and presum-

ably in conjunction with Na-acetylation (3). Perhaps thisapparent association with the initiation complex represents themechanism by which MetAP in higher eukaryotes is recruitedto the ribosome to insure stoichiometric interactions that leadto the highly selective (and apparently quantitative) removal ofmethionine from susceptible sequences (9).Why the human enzyme has a significantly different orga-

nization than that of the yeast MetAP, including two majorinserts (one of 63 amino acids) in the catalytic domain (asindicated by the prokaryotic forms) and an entirely differentN-terminal extension, is also unclear. The N-terminal se-quence of the yeast MetAP has two putative zinc-fingerdomains that may allow interactions with nucleic acid (13). Theexcision of these domains has little effect on the catalyticproperties of the enzyme, but the construct expressing thistruncated form is significantly less effective in rescuing theslow-growth phenotype of a MetAP mutant as compared withthe wild-type MetAP (34). In contrast, the human enzymelacks the consensus zinc-finger sequences and instead containsthe extended runs of basic and acidic residues. The function ofthese sequences is unknown.These differences are sufficiently great to suggest that

human MetAP may be a member of a distinct subfamily ofMetAPs, distantly related to but distinct from the prokaryoticand yeast MetAPs. Strong support for this view is provided bythe recent report of a yeast ORF that is distinctly more closelyrelated to human MetAP than is either the yeast or prokaryoticMetAPs (26). This protein (designated S.c. II in Fig. 2) con-

CATALYTIC (Co+2) DOMAINkDa I---------- ----------I

30F :I

LI43L

II 53

,,E. coli-B. subtilis

S. typhimuriumn

Yeast

Porcine,Human

M. fervidus

FIG. 4. Schematic representation of the two proposed subfamiliesof MetAP.

Biochemistry: Arfin et al.

Dow

nloa

ded

by g

uest

on

Dec

embe

r 12

, 202

1

Proc. Natl. Acad. Sci. USA 92 (1995)

Table 1. Amino acid identities in the catalytic domains of MetAPs

Type I Type II

E.c. S.c. I H.s. S.c. II M.f

Type I E.c. - 104/244 50/234 48/234 41/189T S.c. I 43 - 49/231 40/231 34/189

H.s. 21 21 185/314 52/189Type II S.c. II 20 17 59 43/189

WM. 21 18 28 23 -

Entries above the diagonal refer to the number of identical residuesper total number of residues compared based on the alignments in Fig.2. Entries below the diagonal refer to percentage identities. Theremainder of the M. fervidus' sequence is unavailable. Ec., E. coliMetAP; S.c. I, S. cerevisiae MetAP; H.s., human MetAP; S.c. II, S.cerevisiae hypothetical protein YBLO91c; Mf., partial ORF from M.fervidus.

tains the same organization as human MetAP with respect tothe inserts in the catalytic domains but lacks a long N-tertninalextension [it possesses an N-terminal segment of 25 aminoacids beyond the catalytic domain that is unrelated to eitherhuman MetAP or the initially identified yeast MetAP (desig-nated S.c. I in Fig. 2)]. This yeast protein is "80% identical tothe human MetAP in the catalytic domain, including completeconservation of the putative Co2+ ligands. Thus, yeast appar-ently expresses both a prokaryotic-type and a human-typeMetAP. We propose to designate the former as MetAP-I andto designate the latter as MetAP-II. In addition, a partial ORFfrom M. fervidus, previously identified as a potential homologof rat p67 (18) (and, hence, MetAP-1I type) (see Fig. 2), showssignificant similarity to the human and yeast MetAP-II pro-teins (40% and 34% similarity, respectively, over the knownsegment), suggesting that the MetAP-II family extends acrossthe eukaryotic-prokaryotic boundary. It is, therefore, possiblethat a human MetAP-I-type enzyme also exists.The general structure of the two MetAP families is shown

in Fig. 4, and their relationships in size and similarity aresummarized in Table 1. Although all are likely Co2+ requiring(this remains to be shown for the S.c. II and M. fervidussequences), the MetAP-I family differs from the MetAP-IIfamily by significant structural features in the catalytic domain.However, more striking are the various N-terminal structures.Prokaryotic MetAPs of both families lack an N-terminaldomain, as does yeast MetAP-I1. Yeast MetAP-I and humanMetAP-II have such extensions and are clearly distinct in thisregard. Because the substrate specificity of both families issimilar (and in keeping with that expected for cotranslationalprocessing), it raises the interesting question whether one orboth function in this regard and why two such families arenecessary, at least in the case of yeast. At the least, theseobservations suggest that cotranslational processing is a morecomplex process than was originally thought.

We thank Mrs. Gale D. Trudeau for her expert assistance in thepreparation of this manuscript. This work was supported by U.S. PublicHealth Service Grants DK32465 (S.M.A. and R.A.B.) and GM20066(B.W.M.).

1. Brown, J. L. & Roberts, K. W. (1976)J. Biol. Chem. 251, 1009-1014.

2. Driessen, H. P. C., deJong, W. W., Tesser, G. I. & Bloemendahl,H. (1985) CRC Crit. Rev. Biochem. 18, 281-325.

3. Arfin, S. M. & Bradshaw, R. A. (1988) Biochemistry 27, 7979-7984.

4. Persson, B., Flinta, C., von Heijne, G. & Jornvall, H. (1985) Eur.J. Biochem. 152, 524-528.

5. Boissel, J.-P., Kasper, T. J., Shah, S. C., Malone, J. I. & Bunn,H. F. (1985) Proc. Natl. Acad. Sci. USA 82, 8448-8452.

6. Sherman, F., Stewart, J. W. & Tsunasawa, S. (1985) BioEssays 3,27-31.

7. Tsunasawa, S., Stewart, J. W. & Sherman, F. (1985)J. Biol. Chem.260, 5382-5391.

8. Moerschell, R. P., Hosokawa, Y., Tsunasawa, S. & Sherman, F.(1990) J. Biol. Chem. 265, 19638-19643.

9. Huang, S., Elliott, R. C., Liu, P. S., Koduri, R. K., Weickmann,J. L., Lee, J. H., Blair, L. C., Ghosh-Dastidar, P., Bradshaw,R. A., Bryan, K. M., Einarson, B., Kendall, R. L., Kolacz, K. H.& Saito, K. (1987) Biochemistry 26, 8242-8246.

10. Boissel, J.-P., Kasper, T. J. & Bunn, H. F. (1988) J. Biol. Chem.263, 8443-8449.

11. Ben-Bassat, A., Bauer, K., Chang, S.-Y., Myambo, K., Boosman,A. & Chang, S. (1987) J. Bacteriol. 169, 751-757.

12. Miller, C. G., Strauch, K. L., Kukral, A. M., Miller, J.-L., Wing-field, P. T., Mazzei, G. J., Werlen, R. C., Garber, P. & Mowa,N. R. (1987) Proc. Natl. Acad. Sci. USA 84, 2718-2722.

13. Chang, Y.-H., Teichert, U. & Smith, J. A. (1990) J. Biol. Chem.265, 19892-19897.

14. Kendall, R. L. & Bradshaw, R. A. (1992) J. Biol. Chem. 267,20667-20673.

15. Nakamura, K., Nakamura, A., Takamatsu, H., Yoshikawa, H. &Yamane, K. (1990) J. Biochem. (Tokyo) 107, 603-607.

16. Chang, Y.-H., Teichert, U. & Smith, J. A. (1992) J. Biol. Chem.267, 8007-8011.

17. Roderick, S. L. & Matthews, B. W. (1993) Biochemistry 32,3907-3912.

18. Bazan, J. F., Weaver, L. H., Roderick, S. L., Huber, R. & Mat-thews, B. W. (1994) Proc. Natl. Acad. Sci. USA 91, 2473-2477.

19. Hoeffken, H. W., Knof, S. H., Bartlett, P. A., Huber, R., Moel-lering, H. & Schumacher, G. (1988) J. Mol. Biol. 204, 417-433.

20. Henrich, B., Monnerjahn, U. & Plapp, R. (1990)J. Bacteriol. 172,4641-4651.

21. Butler, M. J., Bergeron, A., Soostmeyer, G., Zimny, T. & Malek,L. T. (1993) Gene 123, 115-119.

22. Yoshimoto, T., Tone, H., Honda, T., Osatomi, K., Kobayashi, R.& Tsuru, D. (1989) J. Biochem. (Tokyo) 105, 412-416.

23. Endo, F., Tanoue, A., Nakai, H., Hata, A., Indo, Y., Titani, K. &Matsuda, I. (1989) J. Biol. Chem. 264, 4476-4481.

24. Wu, S., Gupta, S., Chatterjee, N., Hileman, R. E., Kinzy, T. G.,Danslow, N. D., Merrick, W. C., Chakrabarti, D., Osterman, J. C.& Gupta, N. K. (1993) J. Biol. Chem. 268, 10796-10801.

25. Chakraborty, A., Saha, D., Bose, A., Chatterjee, M. & Gupta,N. K. (1994) Biochemistry 33, 6700-6706.

26. Feldmann, H., Aigle, M., Aljinovic, G., Andre, B., Baclet, M. C.,et al. (1994) EMBO J. 13, 5795-5809.

27. Frohmann, M. A., Dush, M. K. & Martin, G. R. (1988) Proc. Natl.Acad. Sci. USA 85, 8998-9002.

28. Jones, T. A. (1982) in Crystallographic Computing, ed. Sayre, D.(Oxford Univ. Press, Oxford), pp. 307-317.

29. Kozak, M. (1991) J. Biol. Chem. 266, 19867-19870.30. Datta, B., Ray, M. K., Chakrabarti, D., Wylie, D. E. & Gupta,

N. K. (1989) J. Biol. Chem. 264, 20620-20624.31. Hart, G. W. (1992) Curr. Opin. Cell Biol. 4, 1017-1023.32. Fuller, R. S., Brake, A. & Thorner, J. (1989) Proc. Natl. Acad. Sci.

USA 86, 1434-1438.33. Benton, B. M., Zang, J.-H. & Thorner, J. (1994) J. Cell Biol. 127,

623-639.34. Zuo, S., Guo, Q., Ling, C. & Chang, Y.-H. (1995) Mol. Gen.

Genet. 246, 247-253.

7718 Biochemistry: Arfin et aL

Dow

nloa

ded

by g

uest

on

Dec

embe

r 12

, 202

1