Nucleotide sequence of a cDNA encoding a tobacco thioredoxin

Plant Molecular Biology 17: 143-147, 1991. © 1991 Kluwer Academic Publishers. Printed in Belgium.

Update section Sequence


Isabelle Marty and Yves Meyer*
Laboratoire de Physiologie et Biologie Moléculaire Végétales, Unité de Recherche Associée au CNRS 565, avenue de Villeneuve, 66860 Perpignan, France (*author for correspondence)

Received 5 March 1991; accepted 13 March 1991

Thioredoxins are small (about 12 kDa) proteins which reduce disulphide bridges on target enzymes [6]. Protein or nucleic acid sequences have been established for different thioredoxins from vertebrates [7, 8, 15, 16], algae [3, 4, 12] and bacteria [6, 10]. The active site shows a highly conserved sequence, -C-G-P-C-. Recently a larger consensus thioredoxin signature, (AT)-x-W-C-(AG)-(PH)-C, has been defined in the PROSITE database [1]. It matches all thioredoxins but no other protein sequences. In higher plants a spinach f-type cDNA [9], an m-type cDNA [14] and an m-type protein [11] have been sequenced. The f-type has a signal peptide typical of chloroplast targeting. Both enzymes are encoded in the nucleus, but are localized and active in the chloroplast in their mature form. The m-type activates NADP-malate dehydrogenase, while the f-type activates enzymes of the photosynthetic carbon cycle. In addition, cytosolic thioredoxin activities have been characterized in spinach. The proteins have been purified but not yet sequenced [5].
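The signature quoted above can be read as a pattern in which each parenthesised group lists the residues allowed at one position and x stands for any residue. As a minimal sketch, assuming Python's standard `re` module, it can be written as a regular expression. The function name `find_signature` and the two short test fragments (stretches around the active site, transcribed from the Fig. 2 alignment) are illustrative assumptions, not part of the original analysis.

```python
import re

# (AT)-x-W-C-(AG)-(PH)-C as a regular expression:
# [AT] = A or T, "." = any residue, then W, C, [AG], [PH], C.
THIOREDOXIN_SIGNATURE = re.compile(r"[AT].WC[AG][PH]C")


def find_signature(protein):
    """Return the 1-based (start, end) of the first signature match, or None."""
    match = THIOREDOXIN_SIGNATURE.search(protein)
    if match is None:
        return None
    # match.start() is 0-based; match.end() is exclusive, so it already
    # equals the 1-based position of the last matched residue.
    return match.start() + 1, match.end()


# Illustrative fragments around the active site (assumed transcriptions).
tobacco_fragment = "VVVDFTASWCGPCRF"
ecoli_fragment = "LVDFWAEWCGPCKM"

print(find_signature(tobacco_fragment))  # (7, 13)
print(find_signature(ecoli_fragment))    # (6, 12)
print(find_signature("MKVLAAGG"))        # None: no signature present
```

The positions reported here are relative to the fragments, not to the full-length proteins.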

We have isolated a complete or nearly complete clone from a cDNA library from Nicotiana tabacum cells cultivated in vitro. This 699-nucleotide insert hybridizes on northern blots with a 0.9 kb mRNA. It contains an ORF of 378 bp (Fig. 1). The 3' untranslated region is 242 bp long, including 16 bp of poly(A)+ tail. Two AT-rich regions could serve as polyadenylation signals. The ORF codes for a protein of 126 amino acids. At positions 43-49 it contains a sequence which perfectly matches the thioredoxin signature. In addition, the protein sequence shows 40-50% homology with Chlamydomonas Ch1, assumed to be a cytoplasmic protein [4], and with vertebrate thioredoxins, using FASTA [13]. In contrast, the homology with the m and f spinach chloroplast thioredoxins is only 30%. Figure 2 presents a multiple alignment [2] with some of the published thioredoxin sequences, sorted by decreasing homology with the tobacco sequence. Chlamydomonas [3, 4], rabbit [7], Anacystis [12], Corynebacterium [10] and m-type spinach 2 [11] sequences were obtained by protein sequencing. Human [16], rat [15], chicken [8], Escherichia coli [6], f-type [9] and m-type 1 [14] spinach sequences are derived from cDNA sequencing, showing the presence of a large signal peptide in the spinach thioredoxins. The tobacco sequence matches the mature protein sequences and has no signal peptide, suggesting that the protein encoded by the clone we have isolated is cytoplasmic.
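The ORF length, the protein length and the position of the signature can be cross-checked with a few lines of Python. This is only a sketch: the sequence string is a transcription of the deduced amino acid sequence in Fig. 1 and is assumed to be accurate, the 378 bp are assumed not to include the stop codon, and the variable names are illustrative.

```python
import re

# Deduced amino acid sequence of the ORF, transcribed from Fig. 1
# (assumption: the transcription is accurate).
TOBACCO_TRX = (
    "MAANDATSSEEGQVFGCHKVEEWNEYFKKGVETKKLVVVD"
    "FTASWCGPCRFIAPILADIAKKMPHVIFLKVDVDELKTVS"
    "AEWSVEAMPTFVFIKDGKEVDRVVGAKKEELQQTIVKHAA"
    "PATVTA"
)

ORF_LENGTH_BP = 378  # ORF length quoted in the text

# One codon is three nucleotides, so the ORF length must be a multiple
# of three and encodes ORF_LENGTH_BP / 3 residues.
assert ORF_LENGTH_BP % 3 == 0
expected_residues = ORF_LENGTH_BP // 3

# PROSITE thioredoxin signature (AT)-x-W-C-(AG)-(PH)-C as a regex.
signature = re.compile(r"[AT].WC[AG][PH]C")
match = signature.search(TOBACCO_TRX)

# Convert the 0-based, end-exclusive match span to 1-based inclusive positions.
signature_span = (match.start() + 1, match.end())

print(len(TOBACCO_TRX), expected_residues, signature_span)
```

With the sequence as transcribed, the protein length agrees with 378 / 3 = 126 and the signature falls at residues 43-49, as stated in the text.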

Acknowledgement

This work was supported by the Ministère de la Recherche Scientifique and by the LIMAGRAIN group.

The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession number X58527.

Deduced amino acid sequence of the ORF (residue numbers at left; the nucleotide sequence is deposited under accession number X58527):

  1  MAANDATSSE EGQVFGCHKV EEWNEYFKKG VETKKLVVVD FTASWCGPCR FIAPILADIA
 61  KKMPHVIFLK VDVDELKTVS AEWSVEAMPT FVFIKDGKEV DRVVGAKKEE LQQTIVKHAA
121  PATVTA

Fig. 1. Nucleotide sequence of the tobacco thioredoxin clone and the deduced amino acid sequence of the ORF. The PROSITE signature at amino acids 43-49 is underlined in bold type. The polyadenylation signals are underlined with a thin line.

Fig. 2. Multiple protein sequence alignment of tobacco thioredoxin with Chlamydomonas [3, 4], rabbit [7], human [16], rat [15], chicken [8], Anacystis nidulans [12], Corynebacterium nephridii [10], Escherichia coli [6], f-spinach [9], m-spinach 1 [14] and m-spinach 2 [11]. * indicates perfect homology between amino acids, : a conservative amino acid replacement. The arrows indicate the end of the signal peptide. In the consensus sequence, ! represents I or V, $ L or M, % F or Y, + B, D, E, N, Q or Z. The thioredoxin signature defined in PROSITE [1] is underlined.

[Alignment rows, in order: TOBACCO, CHLAM1, RABBIT, HUMAN, RAT, CHICKEN, CHLAM2, ANACY, CORYNEF, ECOLI, FSPINA, MSPINA1, MSPINA2, consensus. Tobacco residues are numbered 1-126 and alignment columns 1-196.]
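The * marks in Fig. 2 flag columns where the aligned residues are identical. A percentage of identical residues between two aligned rows can be sketched as below. This is a simplified illustration and not the FASTA procedure [13] used for the homology values in the text. The function name `percent_identity`, the gap characters and the two fragments (a short stretch around the active site, transcribed from the tobacco and human rows) are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap_chars="-."):
    """Percent identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        # Skip columns in which either sequence carries a gap symbol.
        if res_a in gap_chars or res_b in gap_chars:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Illustrative aligned fragments around the active site (assumed transcriptions).
tobacco_fragment = "KLVVVDFTASWCGPCRF"
human_fragment = "KLVVVDFSATWCGPCKM"

print(round(percent_identity(tobacco_fragment, human_fragment), 1))  # 76.5
```

The value for this fragment is higher than the 40-50% reported for the full sequences because the stretch covers the strongly conserved active site region only.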

References

1. Bairoch A: PROSITE: a dictionary of protein sites and patterns. EMBL CD-ROM Release 6.0 (1990).

2. Corpet F: Multiple sequence alignment with hierarchical clustering. Nucl Acids Res 16: 10881-10890 (1988).

3. Decottignies P, Schmitter JM, Jacquot JP, Dutka S, Picaud A, Gadal P: Purification, characterisation, and complete amino acid sequence of a thioredoxin from a green alga, Chlamydomonas reinhardtii. Arch Biochem Biophys 280: 112-121 (1990).

4. Decottignies P, Schmitter JM, Dutka S, Jacquot JP, Miginiac-Maslow M: Characterisation and primary structure of a second thioredoxin from the green alga, Chlamydomonas reinhardtii. Eur J Biochem, in press.

5. Florencio FJ, Yee BC, Johnson TC, Buchanan BB: An NADPH thioredoxin system in leaves: purification and characterisation of NADP thioredoxin reductase and thioredoxin h from spinach. Arch Biochem Biophys 266: 496-507 (1988).

6. Holmgren A: Thioredoxin. Ann Rev Biochem 54: 237-271 (1985).

7. Johnson RS, Mathews WR, Biemann K, Hopper S: Amino acid sequence of thioredoxin isolated from rabbit bone marrow determined by tandem mass spectrometry. J Biol Chem 263: 9589-9597 (1988).

8. Jones SW, Luk KC: Isolation of a chicken thioredoxin cDNA clone: Thioredoxin mRNA is differentially expressed in normal and Rous sarcoma virus-transformed chicken embryo fibroblasts. J Mol Biol 203: 9687-9696 (1988).

9. Kamo M, Tsugita A, Wiessner C, Wedel N, Bartling D, Herrmann RG, Aguilar F, Gardet-Salvi L, Schuermann P: Primary structure of spinach chloroplast thioredoxin F: protein sequencing and analysis of complete cDNA clones for spinach chloroplast thioredoxin F. Eur J Biochem 182: 315-322 (1989).

10. Mac Farlan SC, Hogenkamp HPC, Eccleston ED, Howard JB, Fuchs JA: Purification, characterisation, and revised amino acid sequence of a second thioredoxin of Corynebacterium nephridii. Eur J Biochem 179: 389-398 (1989).

11. Maeda K, Tsugita A, Dalzoppo D, Vilbois F, Schurmann P: Further characterisation and amino acid sequence of m type thioredoxins from spinach chloroplasts. Eur J Biochem 154: 197-203 (1986).

12. Muller EGD, Buchanan BB: Thioredoxin is essential for photosynthetic growth. The thioredoxin gene of Anacystis nidulans. J Biol Chem 264: 4008-4014 (1989).

13. Pearson WR: Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol 183: 63-93 (1990).

14. Wedel N, Clausmeyer S, Gardet-Salvi I, Herrman RG, Schuermann P: Nucleotide sequence of cDNAs encoding the entire precursor polypeptide for thioredoxin m from spinach chloroplasts. EMBL Data Library (unpublished).

15. Tonissen FK, Robin AJ, Wells JRE: Nucleotide sequence of a cDNA encoding rat thioredoxin. Nucl Acids Res 17: 3973 (1989).

16. Wollman EE, d'Auriol L, Rimsky L, Shaw A, Jacquot J, Wingfield P, Graber P, Dessarps F, Robin P, Galibert F, Bertoglio J, Fradelizi D: Cloning and expression of a cDNA for human thioredoxin. J Biol Chem 263: 15506-15512 (1988).
