

Biochimica et Biophysica Acta, 1131 (1992) 227-232 © 1992 Elsevier Science Publishers B.V. All rights reserved 0167-4781/92/$05.00

BBAEXP 90365 Short Sequence-Paper

Cloning and nucleotide sequences of two lipase genes from Candida cylindracea

Sonia Longhi, Fabrizia Fusetti, Rita Grandori, Marina Lotti, Marco Vanoni and Lilia Alberghina

Dipartimento di Fisiologia e Biochimica Generali, Sezione Biochimica Comparata, Università degli Studi di Milano, Milano (Italy)

(Received 16 April 1992)

Key words: Lipase gene; Cloning; Genomic sequence; (Yeast); (C. cylindracea)

Two lipase-encoding genes (LIP1 and LIP2) have been isolated from a SacI genomic library of the yeast Candida cylindracea and their nucleotide sequences have been determined. Comparison with the sequence of a cDNA ruled out the presence of introns in the two genes. Both ORFs encode mature proteins of 534 residues with putative signal peptides of 15 and 14 amino acids, respectively. When compared with other lipase sequences, the two C. cylindracea lipases showed homology only with the Geotrichum candidum lipase, whereas they shared a significant similarity with several esterases.

Lipases (EC 3.1.1.3) are ubiquitous enzymes that hydrolyze glycerol esters of long-chain fatty acids by acting at an oil/water interface [1]. They can be employed for a broad range of industrial applications, for example in the food, chemical, pharmaceutical and detergent industries [2]. Several lipases from microorganisms (including bacteria, yeast and fungi), plants and mammals have been purified and characterized [3]. Present efforts are devoted to the determination of the three-dimensional structure of these enzymes and to the molecular cloning of the encoding genes. The asporogenic yeast Candida cylindracea produces extracellular lipase(s) that are commercialized and extensively used in industrial processes, particularly for transesterification and for the resolution of racemic mixtures [4]. Conflicting reports on the number of protein species present in commercial lipase preparations have been published [5,6]. Multiple esterase activities with different specificities have been reported in partially purified preparations of the C. cylindracea enzyme [7]. A cDNA clone containing the coding sequence for lipase has been isolated from C. cylindracea [8]. In order to clarify whether the different lipase and/or esterase activities so far described in C. cylindracea are indeed encoded by separate genes, and as an aid in the elucidation of structure-function relationships for this enzyme, the molecular cloning of lipase-encoding gene(s) was undertaken in our laboratory.

Correspondence to: L. Alberghina, Sezione Biochimica Comparata, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy.

The sequences reported in this paper have been submitted to the EMBL/Databank under the accession numbers X64703 (LIP1) and X64704 (LIP2).

Fig. 1. Restriction maps of clones A and B and localization of the LIP1 and LIP2 sequences. The location of the genes is shown by a white bar, with the coding sequence represented by an arrow. Southern analysis of clone B was performed with probes derived from the LIP1 sequence. DNA fragments (about 200 bp each) were separated by gel electrophoresis, labeled and used to probe Southern blots of plasmid DNA (200 ng) digested with several restriction endonucleases. The specific activity was 2.5.10 ~ cpm/ml. Hybridization was carried out at 42°C. Symbols identify corresponding DNA regions in the two genes.

A C. cylindracea genomic library was constructed in the SacI site of plasmid pGEM3Z (Promega) with yeast DNA digested to completion with SacI and size-selected for 4-7 kbp. Screening was performed by colony hybridization [9] with two oligonucleotides synthesized on the basis of a C. cylindracea cDNA lipase clone [8]. Probes were designed to specifically recognize the 5' (5'-TCGGTGGCTGCTGCCCCCACC-3', probe A) and 3' (5'-GCGTACCCCGGCGACATCACC-3', probe B) halves of the gene, respectively. Two positive clones containing plasmids with either a 4.2 kbp (clone A) or a 5.1 kbp (clone B) SacI-SacI insert were isolated (Fig. 1).
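The logic of probing a cloned insert with an oligonucleotide can be illustrated in silico. The following Python sketch is not part of the original work: it simply looks for exact matches of a probe, or of its reverse complement, in a DNA string. The function names and the toy target sequence are assumptions introduced for illustration; only the probe B sequence is taken from the text.

```python
# Minimal sketch: locate an oligonucleotide probe on either strand of a sequence.
# Helper names and the toy target are illustrative assumptions, not from the paper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_probe(target: str, probe: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact match of the probe."""
    hits = []
    for strand, query in (("+", probe), ("-", reverse_complement(probe))):
        start = target.find(query)
        while start != -1:
            hits.append((start, strand))
            start = target.find(query, start + 1)
    return hits


PROBE_B = "GCGTACCCCGGCGACATCACC"  # probe B as given in the text
# Toy target: the probe flanked by a few arbitrary bases (hypothetical data).
target = "TTGATGACG" + PROBE_B + "CAGGGC"

print(find_probe(target, PROBE_B))
print(find_probe(reverse_complement(target), PROBE_B))
```

Real colony hybridization tolerates mismatches depending on stringency, so an exact string search is only a simplified stand-in for the experiment.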

Southern blot analysis located the sequence reacting with the lipase-specific probes within a 1.7 kbp SacI-EcoRV fragment of clone A (Fig. 1). This fragment was entirely sequenced on both strands by the dideoxy chain termination method [10] (Fig. 2A).

The cloned sequence contained a single 5'-truncated open reading frame (ORF) of 1647 nucleotides, which we refer to as LIP1, terminating with a TAA stop codon. The nucleotide sequence of the genomic clone matched that reported for the complete cDNA of C. cylindracea lipase [11], the only difference being a substitution (A → G) at position 4, resulting in the replacement of Lys (cDNA) by Glu. The comparison between the genomic and the cDNA sequences showed that the LIP1 sequence spans the whole gene starting from the second codon and that this gene does not contain introns.

A (LIP1)

[first row, nucleotides 1-75 and residues up to 10, missing]

ACC ATC ACC GGT CTC AAC GCC ATC ATC AAC GAG GCG TTC CTC GGC ATT CCC TTT GCC GAG CCG CCG GTG GGC AAC  150
Thr Ile Thr Gly Leu Asn Ala Ile Ile Asn Glu Ala Phe Leu Gly Ile Pro Phe Ala Glu Pro Pro Val Gly Asn  35

CTC CGC TTC AAG GAC CCC GTG CCG TAC TCC GGC TCG CTC GAT GGC CAG AAG TTC ACG CTG TAC GGC CCG CTG TGC  225
Leu Arg Phe Lys Asp Pro Val Pro Tyr Ser Gly Ser Leu Asp Gly Gln Lys Phe Thr Ser Tyr Gly Pro Ser Cys  60

ATG CAG CAG AAC CCC GAG GGC ACC TAC GAG GAG AAC CTC CCC AAG GCA GCG CTC GAC TTG GTG ATG CAG TCC AAG  300
Met Gln Gln Asn Pro Glu Gly Thr Tyr Glu Glu Asn Leu Pro Lys Ala Ala Leu Asp Leu Val Met Gln Ser Lys  85

GTG TTT GAG GCG GTG CTG CCG CTG AGC GAG GAC TGT CTC ACC ATC AAC GTG GTG CGG CCG CCG GGC ACC AAG GCG  375
Val Phe Glu Ala Val Ser Pro Ser Ser Glu Asp Cys Leu Thr Ile Asn Val Val Arg Pro Pro Gly Thr Lys Ala  110

GGT GCC AAC CTC CCG GTG ATG CTC TGG ATC TTT GGC GGC GGG TTT GAG GTG GGT GGC ACC AGC ACC TTC CCT CCC  450
Gly Ala Asn Leu Pro Val Met Leu Trp Ile Phe Gly Gly Gly Phe Glu Val Gly Gly Thr Ser Thr Phe Pro Pro  135

GCC CAG ATG ATN ACC AAG AGC ATT GCC ATG GGC AAG CCC ATC ATC CAC GTG AGC GTC AAC TAC CGC GTG TCG TCG  525
Ala Gln Met Ile Thr Lys Ser Ile Ala Met Gly Lys Pro Ile Ile His Val Ser Val Asn Tyr Arg Val Ser Ser  160

TGG GGG TTC TTG GCT GGC GAC GAG ATC AAG GCC GAG GGC AGT GCC AAC GCC GGT TTG AAG GAC CAG CGC TTG GGC  600
Trp Gly Phe Leu Ala Gly Asp Glu Ile Lys Ala Glu Gly Ser Ala Asn Ala Gly Leu Lys Asp Gln Arg Leu Gly  185

ATG CAG TGG GTG GCG GAC AAC ATT GCG GCG TTT GGC GGC GAC CCG ACC AAG GTG ACC ATC TTT GGC GAG CTG GCG  675
Met Gln Trp Val Ala Asp Asn Ile Ala Ala Phe Gly Gly Asp Pro Thr Lys Val Thr Ile Phe Gly Glu Ser Ala  210

GGC AGC ATG TCG GTC ATG TGC CAC ATT CTC TGG AAC GAC GGC GAC AAC ACG TAC AAG GGC AAG CCG CTC TTC CGC  750
Gly Ser Met Ser Val Met Cys His Ile Leu Trp Asn Asp Gly Asp Asn Thr Tyr Lys Gly Lys Pro Leu Phe Arg  235

GCG GGC ATC ATG CAG CTG GGG GCC ATG GTG CCG CTG GAC GCC GTG GAC GGC ATC TAC GGC AAC GAG ATC TTT GAC  825
Ala Gly Ile Met Gln Ser Gly Ala Met Val Pro Ser Asp Ala Val Asp Gly Ile Tyr Gly Asn Glu Ile Phe Asp  260

CTC TTG GCG TCG AAC GCG GGC TGC GGC AGC GCC AGC GAC AAG CTT GCG TGC TTG CGC GGT GTG CTG AGC GAC ACG  900
Leu Leu Ala Ser Asn Ala Gly Cys Gly Ser Ala Ser Asp Lys Leu Ala Cys Leu Arg Gly Val Ser Ser Asp Thr  285

TTG GAG GAC GCC ACC AAC AAC ACN CCT GGG TTC TTG GCG TAC TCC TCG TTG CGG TTG CTG TAC CTC CCC CGG CCC  975
Leu Glu Asp Ala Thr Asn Asn Thr Pro Gly Phe Leu Ala Tyr Ser Ser Leu Arg Leu Ser Tyr Leu Pro Arg Pro  310

GAC GGC GTG AAC ATN ACN GAC GAC ATG TAC GCC TTG GTG CGC GAG GGC AAG TAT GCC AAC ATC CCT GTG ATC ATC  1050
Asp Gly Val Asn Ile Thr Asp Asp Met Tyr Ala Leu Val Arg Glu Gly Lys Tyr Ala Asn Ile Pro Val Ile Ile  335

GGC GAC CAG AAC GAC GAG GGC ACC TTC TTT GGC ACC CTG CTG TTG AAC GTG ACC ACG GAT GCC CAG GCC CGC GAG  1125
Gly Asp Gln Asn Asp Glu Gly Thr Phe Phe Gly Thr Ser Ser Leu Asn Val Thr Thr Asp Ala Gln Ala Arg Glu  360

TAC TTC AAG CAG CTG TTT GTC CAC GCC AGC GAC GCG GAG ATC GAC ACG TTG ATG ACG GCG TAC CCC GGC GAC ATC  1200
Tyr Phe Lys Gln Ser Phe Val His Ala Ser Asp Ala Glu Ile Asp Thr Leu Met Thr Ala Tyr Pro Gly Asp Ile  385

ACC CAG GGC CTG CCG TTC GAC ACG GGT ATT CTC AAC GCC CTC ACC CCG CAG TTC AAG AGA ATC CTG GCG GTG CTC  1275
Thr Gln Gly Ser Pro Phe Asp Thr Gly Ile Leu Asn Ala Leu Thr Pro Gln Phe Lys Arg Ile Ser Ala Val Leu  410

GGC GAC CTT GGC TTT ACG CTT GCT CGT CGC TAC TTC CTC AAC CAC TAC ACC GGC GGC ACC AAG TAC TCA TTC CTC  1350
Gly Asp Leu Gly Phe Thr Leu Ala Arg Arg Tyr Phe Leu Asn His Tyr Thr Gly Gly Thr Lys Tyr Ser Phe Leu  435

CTG AAG CAG CTC CTG GGC TTG CCG GTG CTC GGA ACN TTC CAC TCC AAC GAC ATT GTC TTC CAG GAC TAC TTG TTG  1425
Ser Lys Gln Leu Ser Gly Leu Pro Val Leu Gly Thr Phe His Ser Asn Asp Ile Val Phe Gln Asp Tyr Leu Leu  460

GGC AGC GGC TCG CTC ATC TAC AAC AAC GCG TTC ATT GCG TTT GCC ACN GAC TTG GAC CCC AAC ACC GCG GGG TTG  1500
Gly Ser Gly Ser Leu Ile Tyr Asn Asn Ala Phe Ile Ala Phe Ala Thr Asp Leu Asp Pro Asn Thr Ala Gly Leu  485

TTG GTG AAG TGG CCC GAG TAC ACC AGC AGC CTG CAG CTG GGC AAC AAC TTG ATG ATG ATC AAC GCC TTG GGC TTG  1575
Leu Val Lys Trp Pro Glu Tyr Thr Ser Ser Ser Gln Ser Gly Asn Asn Leu Met Met Ile Asn Ala Leu Gly Leu  510

TAC ACC GGC AAG GAC AAC TTC CGC ACC GCC GGC TAC GAC GCG TTG TTC TCC AAC CCG CCG CTG TTC TTT GTG TAA  1650
Tyr Thr Gly Lys Asp Asn Phe Arg Thr Ala Gly Tyr Asp Ala Leu Phe Ser Asn Pro Pro Ser Phe Phe Val ***  534

B (LIP2)

AGGCTGTCGCACGTCACTGCTGCTTGGCTTTGCCAGCATGTGGCCGCCACGCCCCGTCTGC
ACGTTCTCTGGTCTTTANTGCATTGGAGCCAGGAACCGTAGTGCAANCCCGCGCCGGTCCGGGGTATAAAAGCGGAAGCACTCTCACTCCAGTCTCTNC

ATG AAG CTC TGT TTG CTT GCT CTT GGT GCT GCG GTG GCG GCA GCC CCC ACG GCC ACC CTC GCC AAC GGC GAC ACC  75
Met Lys Leu Cys Leu Leu Ala Leu Gly Ala Ala Val Ala Ala Ala Pro Thr Ala Thr Leu Ala Asn Gly Asp Thr  11

ATC ACC GGT CTC AAC GCC ATT GTC AAC GAA AAG TTT CTC GGC ATA CCG TTT GCC GAG CCG CCC GTG GGC ACG CTC  150
Ile Thr Gly Leu Asn Ala Ile Val Asn Glu Lys Phe Leu Gly Ile Pro Phe Ala Glu Pro Pro Val Gly Thr Leu  36

CGC TTC AAG CCG CCC GTG CCG TAC TCG GCG TCG CTC AAC GGC CAG CAG TTT ACC CTG TAC GGC CCG CTG TGC ATG  225
Arg Phe Lys Pro Pro Val Pro Tyr Ser Ala Ser Leu Asn Gly Gln Gln Phe Thr Ser Tyr Gly Pro Ser Cys Met  61

CAG ATG AAC CCT ATG GGC TCG TTT GAG GAC ACA CTT CCC AAG AAT GCG CGG CAT TTG GTG CTC CAG TCC AAG ATC  300
Gln Met Asn Pro Met Gly Ser Phe Glu Asp Thr Leu Pro Lys Asn Ala Arg His Leu Val Leu Gln Ser Lys Ile  86

TTC CAA GTG GTG CTT CCC AAC GAC GAG GAC TGT CTC ACC ATC AAC GTG ATC CGG CCG CCC GGC ACC AGG GCC AGT  375
Phe Gln Val Val Leu Pro Asn Asp Glu Asp Cys Leu Thr Ile Asn Val Ile Arg Pro Pro Gly Thr Arg Ala Ser  111

GCT GGT CTC CCG GTG ATG CTC TGG ATC TTT GGC GGT GGG TTT GAG CTT GGC GGC TCC AGC CTC TTT CCA GGA GAC  450
Ala Gly Leu Pro Val Met Leu Trp Ile Phe Gly Gly Gly Phe Glu Leu Gly Gly Ser Ser Leu Phe Pro Gly Asp  136

CAG ATG GTG GCC AAG AGC GTG CTC ATG GGT AAA CCG GTG ATC CAC GTG AGC ATG AAC TAC CGC GTG GCG TCA TGG  525
Gln Met Val Ala Lys Ser Val Leu Met Gly Lys Pro Val Ile His Val Ser Met Asn Tyr Arg Val Ala Ser Trp  161

GGG TTC TTG GCC GGC CCC GAC ATC CAG AAC GAA GGC AGC GGG AAC GCC GGC TTG CAT GAC CAG CGC TTG GCC ATG  600
Gly Phe Leu Ala Gly Pro Asp Ile Gln Asn Glu Gly Ser Gly Asn Ala Gly Leu His Asp Gln Arg Leu Ala Met  186

CAG TGG GTG GCG GAC AAC ATT GCT GGG TTT GGC GGC GAC CCG AGC AAG GTG ACC ATA TAC GGC GAG CTG GCG GGC  675
Gln Trp Val Ala Asp Asn Ile Ala Gly Phe Gly Gly Asp Pro Ser Lys Val Thr Ile Tyr Gly Glu Ser Ala Gly  211

AGC ATG TCG ACG TTT GTG CAC CTT GTG TGG AAC GAC GGC GAC AAC ACG TAC AAC GGC AAG CCG TTG TTC CGC GCC  750
Ser Met Ser Thr Phe Val His Leu Val Trp Asn Asp Gly Asp Asn Thr Tyr Asn Gly Lys Pro Leu Phe Arg Ala  236

GCC ATC ATG CAG CTG GGC TGC ATG GTG CCG CTG GAC CCG GTG GAC GGC ACG TAC GGC ACC GAG ATC TAC AAC CAG  825
Ala Ile Met Gln Ser Gly Cys Met Val Pro Ser Asp Pro Val Asp Gly Thr Tyr Gly Thr Glu Ile Tyr Asn Gln  261

GTG GTG GCG TCT GCC GGG TGT GGC AGT GCC AGC GAC AAG CTC GCG TGC TTG CGC GGC CTT CTG CAG GAC ACG TTG  900
Val Val Ala Ser Ala Gly Cys Gly Ser Ala Ser Asp Lys Leu Ala Cys Leu Arg Gly Leu Ser Gln Asp Thr Leu  286

TAC CAG GCC ACG AGC GAC ACG CCC GGC GTG TTG GCG TAC CCG TCG TTG CGG TTG CTG TAT CTC CCG CGG CCC GAC  975
Tyr Gln Ala Thr Ser Asp Thr Pro Gly Val Leu Ala Tyr Pro Ser Leu Arg Leu Ser Tyr Leu Pro Arg Pro Asp  311

GGC ACC TTC ATC ACC GAC GAC ATG TAT GCC TTG GTG CGG GAC GGC AAG TAC GCA CAC GTG CCG GTG ATC ATC GGC  1050
Gly Thr Phe Ile Thr Asp Asp Met Tyr Ala Leu Val Arg Asp Gly Lys Tyr Ala His Val Pro Val Ile Ile Gly  336

GAC CAG AAC GAC GAG GGC ACT TTG TTT GGG CTC CTG CTG TTG AAC GTG ACC ACA GAT GCT CAG GCA CGG GCG TAC  1125
Asp Gln Asn Asp Glu Gly Thr Leu Phe Gly Leu Ser Ser Leu Asn Val Thr Thr Asp Ala Gln Ala Arg Ala Tyr  361

TTC AAG CAG CTG TTC ATC CAC GCC AGC GAT GCG GAG ATC GAC ACG TTG ATG GCG GCG TAC ACC AGC GAC ATC ACC  1200
Phe Lys Gln Ser Phe Ile His Ala Ser Asp Ala Glu Ile Asp Thr Leu Met Ala Ala Tyr Thr Ser Asp Ile Thr  386

CAG GGT CTG CCG TTC GAC ACC GGC ATC TTC AAT GCC ATC ACC CCG CAG TTC AAA CGG ATC CTG GCG TTG CTT GGC  1275
Gln Gly Ser Pro Phe Asp Thr Gly Ile Phe Asn Ala Ile Thr Pro Gln Phe Lys Arg Ile Ser Ala Leu Leu Gly  411

GAC CTT GCG TTC ACG CTT GCG CGT CGC TAC TTC CTC AAC TAC TAC CAG GGC GGC ACC AAG TAC TCG TTT CTC CTG  1350
Asp Leu Ala Phe Thr Leu Ala Arg Arg Tyr Phe Leu Asn Tyr Tyr Gln Gly Gly Thr Lys Tyr Ser Phe Leu Ser  436

AAG CAG CTT CTG GGG TTG CCC GTC TTG GGC ACC TTC CAC GGC AAC GAC ATC ATC TGG CAG GAC TAC TTG GTG GGC  1425
Lys Gln Leu Ser Gly Leu Pro Val Leu Gly Thr Phe His Gly Asn Asp Ile Ile Trp Gln Asp Tyr Leu Val Gly  461

AGC GGC AGT GTG ATC TAC AAC AAC GCG TTC ATT GCG TTT GCC AAC GAC CTC GAC CCG AAC AAG GCG GGC TTG TGG  1500
Ser Gly Ser Val Ile Tyr Asn Asn Ala Phe Ile Ala Phe Ala Asn Asp Leu Asp Pro Asn Lys Ala Gly Leu Trp  486

ACC AAC TGG CCC ACG TAC ACC AGC AGT CTG CAG CTG GGC AAC AAC TTG ATG CAG ATC AAC GGC TTG GGG TTG TAC  1575
Thr Asn Trp Pro Thr Tyr Thr Ser Ser Ser Gln Ser Gly Asn Asn Leu Met Gln Ile Asn Gly Leu Gly Leu Tyr  511

ACC GGC AAG GAC AAC TTC CGC CCG GAT GCG TAC AGC GCC CTC TTT TCC AAC CCG CCA CTG TTC TTT GTG TAG  1647
Thr Gly Lys Asp Asn Phe Arg Pro Asp Ala Tyr Ser Ala Leu Phe Ser Asn Pro Pro Ser Phe Phe Val ***  534

GAACNGAATGCACGGNGTCGAGCAAACACCCGGAGCGGGCNGGACANGGAGCAGGCTTGGACGCAAAGCAGGCTCGGACANGGAGCAGGCNGGGNTC
NGCGNTNGNACGGACTTATCAGNGGGGTGATCAGGGGCAAGCCTGNGNANGGCGCCGAGTGAGCTGAGGCGGCNGCACGAACAAGCCCTGAAACCC
AACCGCCGTCTGTGGTTGGTGGATCAGGATCTGCATGC

Fig. 2. Nucleotide and deduced amino acid sequences of LIP1 (A) and LIP2 (B). Nucleotides are numbered starting from the translational start site as position +1. Amino acids are numbered taking the first residue of the mature proteins as position +1. Sequences corresponding to oligonucleotides A and B are underlined in the LIP2 sequence. A putative TATA-box is underlined upstream of the LIP2 gene. Potential N-glycosylation sites are boxed. The closed triangles indicate the signal sequence cleavage sites. Serines encoded by CTG codons are shown in bold type. The ATG of the LIP1 gene is shown in parentheses because it is not included in the SacI-SacI clone.

[In this plain-text transcription the underlining, boxes, triangles and bold type are not reproduced, and N marks a base that is illegible in the source.]

The yeast strain ATCC 14830 has been reported to use a non-universal genetic code, with CTG coding for serine instead of leucine [8]. The amino acid sequences described in this work have been deduced accordingly. The NH2-terminal sequence of the LIP1 protein is enriched in hydrophobic amino acids and exhibits the properties of a secretion signal sequence [12]. The signal peptidase cleavage site is predicted to occur following the sequence ValAlaAla, thus producing an amino-terminal sequence of the mature product that fully matches that obtained from peptide sequencing [8]. LIP1 encodes a preprotein of 549 amino acids that, upon cleavage of the 15-amino-acid signal sequence, results in a mature protein of 534 residues with a predicted molecular weight of 57 223. This molecular weight is in agreement with that experimentally determined for the purified lipase protein by polyacrylamide gel electrophoresis [5,6]. Three potential N-glycosylation sites (Asn-X-Ser/Thr) [13] were identified at positions 291, 314 and 351 of the putative mature protein. The native enzyme has been shown to contain 4.2% carbohydrate, and mannose and xylose have been identified as its carbohydrate components [14]. The predicted isoelectric point is 4.5. The consensus sequence Gly-X-Ser-X-Gly shared by all lipases sequenced to date [3] has been identified surrounding Ser-209 of the mature protein, as previously reported [15].
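The effect of reading CTG as serine can be made concrete with a short translation routine. The Python sketch below is an illustration only, not the procedure used by the authors: it builds the standard genetic code, overrides the single CTG codon, and translates a five-codon fragment. The names `STANDARD_CODE`, `CTG_SER_CODE` and `translate` are hypothetical, and the fragment is a toy example chosen to contain one CTG codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Variant code described in the text: CTG is read as serine instead of leucine.
CTG_SER_CODE = {**STANDARD_CODE, "CTG": "S"}


def translate(dna: str, code: dict[str, str] = CTG_SER_CODE) -> str:
    """Translate an upper-case DNA string codon by codon (one-letter code)."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(code[dna[i:i + 3]] for i in range(0, usable, 3))


fragment = "GGCGAGCTGGCGGGC"  # toy fragment: GGC GAG CTG GCG GGC
print(translate(fragment))                 # CTG read as Ser
print(translate(fragment, STANDARD_CODE))  # CTG read as Leu
```

With the variant table the fragment reads Gly-Glu-Ser-Ala-Gly, whereas the standard table would place a leucine at the central position.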

A preliminary characterization of clone B revealed a restriction pattern different from that of clone A, suggesting the presence of a second lipase sequence (Fig. 1). This was localized by Southern analysis using four different probes derived from the LIP1 gene, as shown in Fig. 1. The nucleotide sequence was determined and revealed the presence of one ORF (LIP2) of 1644 nucleotides ending with a TAG stop codon (Fig. 2B). LIP2 encodes a preprotein of 548 amino acids, one residue shorter than the LIP1 preprotein. Analysis of the amino-terminus predicted a leader peptide of 14 amino acids. The mature protein would therefore consist of 534 amino acids with a predicted molecular weight of 57 744. One of the three potential N-glycosylation sites of the LIP1 protein is also conserved in the LIP2 protein, at Asn-351. The predicted isoelectric point is 4.9. The lipase consensus sequence has been identified in the stretch of residues surrounding Ser-209.
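Both sequence features mentioned above, the Asn-X-Ser/Thr sequon and the Gly-X-Ser-X-Gly consensus, are simple patterns that can be scanned for in a one-letter protein string. The following Python sketch is a hedged illustration: the function names and the toy peptide are assumptions, and X is taken here as any residue, exactly as the patterns are written in the text (stricter definitions of the sequon exist).

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")            # Asn-X-Ser/Thr
LIPASE_MOTIF = re.compile(r"(?=G[A-Z]S[A-Z]G)")   # Gly-X-Ser-X-Gly


def sequon_positions(protein: str) -> list[int]:
    """1-based positions of the Asn in each Asn-X-Ser/Thr match."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


def motif_serine_positions(protein: str) -> list[int]:
    """1-based positions of the central Ser in each Gly-X-Ser-X-Gly match."""
    return [m.start() + 3 for m in LIPASE_MOTIF.finditer(protein)]


# Toy peptide (hypothetical), containing one GESAG motif and one NVT sequon.
peptide = "TIYGESAGSMSLLNVTTD"
print(motif_serine_positions(peptide))
print(sequon_positions(peptide))
```

Positions are reported relative to the string supplied, so the mature protein sequence would have to be passed in to obtain numbering comparable to that used in the text.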

Fig. 3 shows the alignment of the deduced amino acid sequences of LIP1 and LIP2 mature proteins, showing a high degree of identity (79%). The molecular cloning of two lipase-encoding genes confirms and extends data previously obtained in our laboratory by genomic Southern analysis [16].

Although the control of gene expression in Candida spp. has not been extensively studied, expression signals have been reported to resemble those of Saccharomyces cerevisiae [17,18]. A TATA-like sequence (TATAAAA) is found 35 bp upstream of the initiator codon of the LIP2 gene (Fig. 2B). The restriction map of this gene is identical to that reported for a C. cylindracea cDNA [11]. Evidence of the expression of different lipases by this yeast has been inferred from cDNA analysis [11]. The production of different lipase isoforms would be in agreement with reports pointing to the presence of multiple forms of extracellular lipases in commercial enzyme preparations [5,6]. The presence of lipase isozymes has been described in Geotrichum candidum [19], Aspergillus [20] and Rhizomucor miehei [21].

Fig. 3. Alignment of deduced amino acid sequences of LIP1 (upper line) and LIP2 (lower line) mature proteins. Sequences are in the one-letter code. Residues are numbered starting from the first amino acid of the mature product. Identical residues are boxed.

TABLE I

Codon usage in lipase genes

Values are given as the fraction of each codon among the codons used for the same amino acid, considering CTG as coding for serine.

Amino acid   Codon   LIP1   LIP2
Gly          GGG     0.09   0.14
             GGT     0.11   0.11
             GGC     0.79   0.73
             GGA     0.02   0.02
Glu          GAG     1.00   0.80
             GAA     0.00   0.20
Asp          GAT     0.06   0.09
             GAC     0.94   0.91
Val          GTG     0.86   0.94
             GTA     0.00   0.00
             GTT     0.00   0.00
             GTC     0.14   0.06
Ala          GCG     0.40   0.43
             GCA     0.02   0.06
             GCT     0.10   0.10
             GCC     0.48   0.41
Arg          AGG     0.00   0.06
             AGA     0.07   0.00
Ser          AGT     0.02   0.09
             AGC     0.25   0.29
Lys          AAG     1.00   0.88
             AAA     0.00   0.12
Asn          AAT     0.00   0.06
             AAC     1.00   0.94
Met          ATG     1.00   1.00
Ile          ATA     0.00   0.07
             ATT     0.27   0.11
             ATC     0.73   0.81
Thr          ACG     0.31   0.32
             ACA     0.00   0.06
             ACT     0.00   0.03
             ACC     0.69   0.59
End          TGA     0.00   0.00
Cys          TGT     0.20   0.50
             TGC     0.80   0.50
End          TAG     0.00   1.00
             TAA     1.00   0.00
Tyr          TAT     0.05   0.09
             TAC     0.95   0.91
Leu          TTG     0.44   0.43
             TTA     0.00   0.00
Phe          TTT     0.39   0.47
             TTC     0.61   0.53
Ser          TCG     0.17   0.13
             TCA     0.02   0.02
             TCT     0.00   0.02
             TCC     0.11   0.07
Arg          CGG     0.21   0.44
             CGA     0.00   0.00
             CGT     0.07   0.06
             CGC     0.64   0.44
Gln          CAG     1.00   0.96
             CAA     0.00   0.04
His          CAT     0.00   0.29
             CAC     1.00   0.71
Ser          CTG     0.42   0.38
Leu          CTA     0.00   0.00
             CTT     0.08   0.21
             CTC     0.48   0.36
Pro          CCG     0.52   0.59
             CCA     0.00   0.06
             CCT     0.10   0.03
             CCC     0.39   0.32
Trp          TGG     1.00   1.00

Table I summarizes codon usage frequencies in the LIP1 and LIP2 genes. The reported values have been calculated taking CTG as coding for serine [8], thus assuming seven codons for serine (instead of six) and five (instead of six) for leucine. Some preference is observed for particular triplets, including CTG, which apparently does not undergo any negative selection. Although the codon usage preference is similar in the two genes, some significant differences might indicate that the duplication event from which the two genes originated is not recent.
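The way such codon usage fractions are obtained can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the authors' program: the genetic code table is the standard one with CTG reassigned to serine, the function name `codon_usage` is hypothetical, and the input is assumed to be an in-frame, upper-case sequence containing only A, C, G and T.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}
CODE["CTG"] = "S"  # CTG counted as a serine codon, as in Table I

# Number of codons per amino acid under this code (7 for Ser, 5 for Leu).
SYNONYMOUS = Counter(CODE.values())


def codon_usage(orf: str) -> dict[str, dict[str, float]]:
    """Fraction of each observed codon among the codons of its amino acid."""
    usable = len(orf) - len(orf) % 3
    counts = Counter(orf[i:i + 3] for i in range(0, usable, 3))
    totals = Counter()
    for codon, n in counts.items():
        totals[CODE[codon]] += n
    usage = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODE[codon]
        usage[aa][codon] = n / totals[aa]
    return dict(usage)


toy_orf = "CTGTCGTCGAGC"  # four serine codons: CTG, TCG, TCG, AGC (toy data)
print(codon_usage(toy_orf)["S"])
```

Codons that never occur in the input are simply absent from the result; they correspond to the 0.00 entries of a table such as Table I.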

A database search with the FASTA program [22] for sequences homologous to the C. cylindracea lipase sequences identified no homologous lipase sequence other than that of G. candidum [15] (45% identity over a 544 amino acid overlap). Besides the G. candidum lipase, the best scoring sequences in the FASTA output belonged to the family of acetylcholinesterases, in particular those from Torpedo marmorata [23] (28% identity over 446 amino acids), T. californica [24] (28% identity over 446 amino acids) and Drosophila melanogaster [25] (30% identity over 263 amino acids). Alignments were performed with PILEUP (GCG package, Genetics Computer Group, Wisconsin). Multiple sequence alignments were obtained by first aligning mature lipases and acetylcholinesterases separately and then aligning profiles of the two groups [26]. Interestingly, besides the high overall similarity, the consensus pattern for the active site of acetylcholinesterase, GESAG [27], is identically conserved in both fungal lipases. This alignment strongly suggests a catalytic role for Ser-209 of the C. cylindracea lipases, since the corresponding residue is part of the active site of both G. candidum lipase [28] and T. californica acetylcholinesterase [29]. The strong sequence homology between both fungal lipases and the esterase family may reflect common structural and catalytic features of these enzymes.
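Figures such as "45% identity over a 544 amino acid overlap" combine two numbers: the length of the aligned region and the share of identical positions within it. The Python sketch below shows one plausible way to compute both from a pairwise alignment. It is an assumption-laden illustration: the function name, the gap symbol and the toy alignment are hypothetical, and FASTA's own definition of the overlap may differ in detail (here, columns containing a gap are excluded).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> tuple[float, int]:
    """Return (percent identity, overlap length) for two aligned sequences.

    Columns in which either sequence has a gap are excluded from the overlap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    overlap = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not overlap:
        return 0.0, 0
    matches = sum(a == b for a, b in overlap)
    return 100.0 * matches / len(overlap), len(overlap)


# Toy alignment (hypothetical), one gapped column and two mismatches.
identity, length = percent_identity("GESAGSMS-V", "GESAGAMSTF")
print(f"{identity:.1f}% identity over {length} positions")
```

Applied to two full-length aligned sequences, the same calculation yields a single overall figure comparable in kind to the 79% reported for the LIP1 and LIP2 mature proteins.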

The authors wish to acknowledge M. Vingron (EMBL) for his help in sequence alignment and P. Sarmientos (Farmitalia) for providing oligonucleotide probes. This work has been partially supported by a grant of the CNR Progetto Finalizzato Biotecnologie e Biostrumentazione to L.A.

References

1 Brockman, H.L. (1984) in Lipases (Borgström, B. and Brockman, H.L., eds.), pp. 4-46, Elsevier Science Publishers, Amsterdam.
2 Harwood, J. (1989) Trends Biochem. Sci. 14, 125-126.
3 Antonian, E. (1988) Lipids 23, 1101-1106.
4 Klibanov, A.M. (1989) Trends Biochem. Sci. 14, 141-144.
5 Veeraragavan, K., Colpitts, T. and Gibbs, B.F. (1990) Biochim. Biophys. Acta 1044, 26-33.
6 Shaw, J.F., Chang, C.H. and Wang, Y.J. (1989) Biotechnol. Lett. 11, 779-784.
7 Brahimi-Horn, M.C., Guglielmino, M.L., Elling, L. and Sparrow, L.G. (1990) Biochim. Biophys. Acta 1042, 51-54.
8 Kawaguchi, Y., Honda, H., Taniguchi-Morimura, J. and Iwasaki, S. (1989) Nature 341, 164-166.
9 Miyada, C.G. and Wallace, R.B. (1987) Methods Enzymol. 154, 94-107.
10 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
11 Kawaguchi, Y. and Honda, H. (1991) in Lipases: Structure, Mechanism and Genetic Engineering (Alberghina, L., Schmid, R.D. and Verger, R., eds.), GBF Monographs No. 16, pp. 221-230, VCH, Germany.
12 von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
13 Hubbard, S.C. and Ivatt, R.J. (1981) Annu. Rev. Biochem. 50, 555-583.
14 Tomizuka, N., Ota, Y. and Yamada, K. (1966) Agr. Biol. Chem. 30, 1090-1096.
15 Shimada, Y., Sugihara, A., Iizumi, T. and Tominaga, Y. (1990) J. Biochem. 107, 703-707.
16 Alberghina, L., Grandori, R., Longhi, S., Lotti, M., Fusetti, F. and Vanoni, M. (1991) in Lipases: Structure, Mechanism and Genetic Engineering (Alberghina, L., Schmid, R.D. and Verger, R., eds.), GBF Monographs No. 16, pp. 231-235, VCH, Germany.
17 Takagi, M., Kobayashi, N., Sugimoto, M., Fujii, T., Wataki, J. and Yano, K. (1987) Curr. Genet. 11, 451-457.
18 Singer, S.C., Richards, C.A., Ferone, R., Benedict, D. and Ray, P. (1989) J. Bacteriol. 171, 1372-1378.
19 Sugihara, A., Shimada, Y. and Tominaga, Y. (1990) J. Biochem. 107, 426-430.
20 Höfelmann, M., Hartmann, J., Zink, A. and Schreier, P. (1985) J. Food Sci. 50, 1721-1731.
21 Boel, E., Huge-Jensen, B., Christensen, M., Thim, L. and Fiil, N.P. (1988) Lipids 23, 701-706.
22 Pearson, W.R. and Lipman, D.J. (1988) Proc. Natl. Acad. Sci. USA 85, 2444-2448.
23 Sikorav, J.L., Krejci, E. and Massoulié, J. (1987) EMBO J. 6, 1865-1873.
24 Schumacher, M., Camp, S., Maulet, Y., Newton, M., McPhee-Quigley, K., Taylor, S.S., Friedmann, T. and Taylor, P. (1986) Nature 319, 407-409.
25 Hall, L.M.C. and Spierer, P. (1986) EMBO J. 5, 2949-2954.
26 Vingron, M. and Argos, P. (1989) CABIOS 5, 115-121.
27 Chatonnet, A. and Lockridge, O. (1989) Biochem. J. 260, 625-634.
28 Schrag, J.D., Li, Y., Wu, S. and Cygler, M. (1991) Nature 351, 761-764.
29 Sussman, J.L., Harel, M., Frolow, F., Oefner, C., Goldman, A., Toker, L. and Silman, I. (1991) Science 253, 872-878.