BRIEF COMMUNICATION

Structural and Transcriptional Characterization of rbcS Genes of Cotton (Gossypium hirsutum)

Kumar Paritosh & Deepak Pental & Pradeep Kumar Burma

© Springer Science+Business Media New York 2013

Abstract Ribulose bisphosphate carboxylase/oxygenase small subunit genes (rbcS) are found to be highly expressed in most plant species. Promoters of rbcS genes have been isolated, characterized, and used to enhance expression of transgenes in transgenic lines. In this study, we have identified five different members of the rbcS gene family in cotton (Gossypium hirsutum). All five genes contain three exons and two introns. Nucleotide sequences of the rbcS genes are conserved in their coding regions, but exhibit considerable sequence divergence in both introns and in upstream and downstream flanking sequences. Based on sequence identity, these genes are grouped into two subfamilies. Promoters of all rbcS genes have been isolated, and their cis-acting motifs, exhibiting different arrangements, have been identified. All five rbcS genes are differentially expressed in leaf and boll tissues of cotton plants at different stages of plant development. Three of the five isolated rbcS genes are found to be highly expressed in analyzed tissues. Multiple transcriptional start sites of all members of the gene family have been identified in leaf and boll tissues of cotton plants. Moreover, different members of the rbcS gene family from the two progenitor diploid species of G. hirsutum, namely, Gossypium herbaceum and Gossypium raimondii, have also been isolated to identify the lineage of the five rbcS genes. In addition, a total of three candidate promoters that can be used to enhance expression of transgenes in cotton have been identified.

Keywords Gossypium hirsutum · rbcS · Rubisco · rbcS gene family

Introduction

The ribulose bisphosphate carboxylase/oxygenase (Rubisco) protein is one of the most abundant proteins present in green parts of plants. Structure and expression of the genes encoding the small subunit of the Rubisco protein (rbcS) have been studied in many higher plant species such as Arabidopsis (Krebbers et al. 1988), tomato (Sugita et al. 1987), pea (Fluhr et al. 1986), and wheat (Sasanuma 2001; Sasanuma and Miyashita 1998). The rbcS genes are present as a multigene family, the size of which varies from 4 genes in Arabidopsis thaliana (Krebbers et al. 1988) to 12 genes in wheat (Sasanuma 2001). Differential expression patterns have been observed among the members of the rbcS gene family in different plant species. Although most of the rbcS genes are expressed in the leaf tissues, in some species, the level of expression varies among the different members of the family (Dean et al. 1985; Sugita et al. 1987; Tumer et al. 1986). Further, in some instances like that of tomato, two of the five identified genes, namely rbcS1 and rbcS2, have been shown to express highly in the developing fruits (Wanner and Gruissem 1991). Promoters of these two genes have been characterized as fruit specific and have been used to achieve predominantly fruit-specific expression of reporter genes in tomato plants (Meier et al. 1995). Promoters of rbcS genes from different plant species, e.g., ats1A of A. thaliana (Almeida et al. 1989) and rbcS2 of Brassica rapa (Anisimov et al. 2007), have been shown to be stronger than the CaMV 35S promoter.

Electronic supplementary material The online version of this article (doi:10.1007/s11105-013-0576-1) contains supplementary material, which is available to authorized users.

K. Paritosh · D. Pental
Centre for Genetic Manipulation of Crop Plants, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India

K. Paritosh · D. Pental · P. K. Burma (*)
Department of Genetics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
e-mail: [email protected]

Plant Mol Biol Rep
DOI 10.1007/s11105-013-0576-1

The rbcS gene family in cotton (Gossypium hirsutum) has not been fully described. Only one member of the rbcS gene family has been reported from G. hirsutum (Amarasinghe et al. 2006; Sagliocco et al. 1991). The transcript level of this gene has been shown to be high in the leaf tissue at different stages of plant life. Further, it has been shown that while the 521-bp promoter region showed weaker activity than the CaMV 35S promoter (Song et al. 2000), the full-length 1.8-kb promoter was stronger than the CaMV 35S promoter (Amarasinghe et al. 2006).

The present study was undertaken to isolate the rbcS genes and their promoters present in G. hirsutum. Their expression pattern was analyzed in the leaf and rinds of the boll at different stages of plant development. We report that there are five functional rbcS genes in cotton, which can be classified into two subfamilies based on sequence homology. G. hirsutum (AADD) is an allotetraploid species, with Gossypium herbaceum (AA) and Gossypium raimondii (DD) being the diploid progenitors (Brubaker et al. 1999). Based on characterization of rbcS genes in the two progenitor species, G. herbaceum and G. raimondii, we identified the lineage of the five rbcS genes of G. hirsutum. Further, the promoters of each of the five genes have been isolated, and transcription levels of the five genes have been studied in the leaf and boll rind tissue.

Materials and Methods

Plant Material, Isolation of DNA

G. hirsutum L. Coker 310 FR (Kumar et al. 1998), G. herbaceum var. africanum, and G. raimondii were used as plant material in the study. Plants were grown under greenhouse conditions at day/night temperatures of 32/22 °C and an approximate relative humidity of 70 %.

DNA was isolated using a modified CTAB method (Rawat et al. 2011) and further purified using the DNeasy Plant Maxi Kit (Qiagen). For genome walking experiments, DNA was purified by CsCl density gradient centrifugation.

Design of Degenerate Primers and Isolation of Partial Genomic Sequences

The nucleotide sequences of rbcS genes from different plant species (Table S1), including the sequence reported earlier from cotton, were compared for their sequence divergence. The analysis revealed two distinct clusters, viz. I and II (Fig. S1). The reported sequence of the rbcS gene from cotton fell in cluster I. Sequences from cluster I were analyzed to design degenerate primers from the maximally conserved region among these sequences. The forward (DF5) and reverse (DR3) primers (Table S2), located in exons 1 and 3, respectively, were used for amplification of genomic DNA regions from G. hirsutum. The amplified fragments were cloned in pGEM-T Easy vector (Promega) and sequenced.

Cloning of 5′ and 3′ Regions of rbcS Genes by Genome Walking

For genome walking, seven libraries of DNA from G. hirsutum were constructed using the Genome Walker Kit (Clontech) as per the manufacturer’s protocol. The libraries were created using seven different restriction enzymes, namely DraI, EcoRV, HincII, SspI, ScaI, MscI, and XmnI, and ligated to a universal adaptor. The 5′ and 3′ regions of the different members of the rbcS genes were amplified using a series of nested adaptor and gene-specific primers (Table S2). Amplifications were carried out using Takara HS polymerase (Takara). The amplified fragments were resolved on agarose gel, eluted and cloned in pGEM-T Easy vector, and sequenced. Final sequences were derived from three independent PCR reactions.

RNA Isolation and Amplification of Coding Region of rbcS Genes

Total RNA from leaf and boll tissue samples was isolated using the Spectrum Plant Total RNA Kit (Sigma), following the manufacturer’s instructions. Contaminating DNA was removed from the RNA using the DNaseA kit (Ambion). First-strand cDNA was synthesized from 1 μg of total RNA using the cDNA Archive kit (Perkin Elmer), using poly dT primers following the manufacturer’s instructions. The cDNA was used for the amplification of rbcS genes using primers (Table S2) designed on the basis of the delineated genomic DNA sequences. The amplified fragments were cloned in pGEM-T Easy vector and sequenced. Different members of the family were identified based on gene-specific SNPs.
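The assignment of sequenced clones to family members by gene-specific SNPs can be illustrated with a minimal Python sketch. The SNP positions, bases, and clone sequences below are invented for illustration and are not the diagnostic sites used in this study; the function name assign_gene is likewise hypothetical.

```python
# Hypothetical diagnostic SNPs: 0-based position in the clone -> expected base.
# These positions and bases are invented for illustration only.
DIAGNOSTIC_SNPS = {
    "rbcS1a": {3: "A", 7: "C"},
    "rbcS1b": {3: "G", 7: "C"},
    "rbcS2a": {3: "A", 7: "T"},
}


def assign_gene(clone_seq, snp_table=DIAGNOSTIC_SNPS):
    """Return the names of all genes whose diagnostic bases match the clone."""
    matches = []
    for gene, snps in snp_table.items():
        if all(pos < len(clone_seq) and clone_seq[pos] == base
               for pos, base in snps.items()):
            matches.append(gene)
    return matches


# Toy clone carrying G at position 3 and C at position 7.
print(assign_gene("ATGGCCTCCA"))
```

A clone that matches no entry, or more than one, would need manual inspection; real data would also require aligning each clone to a reference before comparing positions.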

Identification of Transcriptional Initiation Sites by 5′ RLM-RACE

Transcriptional initiation sites (TIS) were determined by carrying out 5′ RACE using the FirstChoice RNA ligase-mediated amplification of cDNA ends (RLM-RACE) kit (Ambion). Two nested reverse primers specific to the coding region of each of the five rbcS genes (Table S2) were used for the 5′ RACE amplification. The amplified fragments were cloned in pGEM-T Easy vector and sequenced. The transcription start sites of the rbcS genes were defined by aligning the obtained sequences with the full-length rbcS gene sequences.
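The logic of locating a TIS from a RACE clone can be sketched as follows: the adaptor is stripped from the read, the remaining 5′ end is located in the genomic sequence, and its offset from the ATG is reported. This is a simplified illustration that assumes an exact match rather than a true alignment; the adaptor, gene sequence, and read are toy strings, and map_tis is a hypothetical helper, not part of any kit or software used in the study.

```python
def map_tis(race_read, adaptor, gene_seq, atg_index):
    """Return the TIS position relative to the A of the ATG.

    Negative values lie upstream of the ATG. Returns None when the read
    lacks the adaptor or its 5' end cannot be found in the gene sequence.
    """
    if not race_read.startswith(adaptor):
        return None  # no adaptor, so the read does not mark a capped 5' end
    five_prime_end = race_read[len(adaptor):]
    start = gene_seq.find(five_prime_end)  # exact match stands in for alignment
    if start == -1:
        return None
    return start - atg_index


# Toy data, invented for illustration.
adaptor = "GCTGATGG"
gene_seq = "TTGACCATCAGTCCAAATGGCT"
atg_index = gene_seq.find("ATG")
read = adaptor + "CAGTCCAAATGG"

print(map_tis(read, adaptor, gene_seq, atg_index))
```

Counting how many independent clones map to each position would then give the per-site clone numbers of the kind reported for each TIS.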

Expression Analysis of the rbcS Genes in the Leaves and Boll of Cotton Plant

Total RNA was isolated from leaf and boll tissue samples, and first-strand cDNA was synthesized as described in the above section. Intron-spanning primer sets were designed for each of the rbcS genes (Table S2), and the genes were amplified from the cDNA pool. The PCR amplifications were performed with 2 μl of the synthesized cDNA in a reaction volume of 25 μl. The reaction mix comprised 200 μM dNTPs, 12.5 pmol of specific primers, 2.5 U of Taq polymerase, and 1× Taq buffer. The PCR conditions were as follows: an initial 5-min denaturation at 95 °C, followed by amplification cycles of 30 s at 95 °C, 30 s at 65 °C, and 1 min at 72 °C. Reactions were terminated after 22, 23, and 24 cycles so that expression levels could be compared during the exponential phase of amplification. The ubiquitin gene was taken as the internal control in each reaction. Amplified products were resolved on 2 % agarose gels.
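The reason for stopping reactions within the exponential phase can be shown with a small worked example. Under the textbook idealization that product multiplies by (1 + efficiency) in each cycle, the ratio of product between two transcripts at a given cycle equals the ratio of their starting templates. The sketch below assumes perfect doubling and invented starting amounts; it is not an analysis performed in this study.

```python
def product_after(n_cycles, start_copies, efficiency=1.0):
    """Idealized PCR yield: copies multiply by (1 + efficiency) each cycle."""
    return start_copies * (1.0 + efficiency) ** n_cycles


# Hypothetical starting template amounts for two transcripts (arbitrary units).
abundant, rare = 1000, 250

for cycles in (22, 23, 24):
    ratio = product_after(cycles, abundant) / product_after(cycles, rare)
    print(cycles, ratio)
```

Once a reaction leaves the exponential phase, this proportionality no longer holds, which is why band intensities from later cycles cannot be compared between genes.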

Results and Discussion

Identification and Isolation of rbcS Genes from G. hirsutum

A 600-bp fragment was obtained on amplification of genomic DNA with a set of degenerate primers (Table S2) designed from the conserved region of known rbcS genes. Five genes, including the earlier reported rbcS gene from cotton (Sagliocco et al. 1991), were identified based on the sequencing of 28 clones obtained from three independent PCRs. Based on the variations that could distinguish between the five different sequences, primers were designed to isolate the upstream and the downstream regions of each gene fragment using seven different Genome Walking Libraries. The obtained amplicons from multiple walk reactions were cloned and sequenced. Gene sequences were assembled through at least three independent PCRs to discern any Taq polymerase-based mutation, and each gene was identified on the basis of sequence variations. The obtained sequences for each of the five rbcS genes were submitted to GenBank under accession numbers JN608783, JN608788, JN608790, JN608791, and JN608792.

Analysis of sequence similarity showed the presence of two subfamilies of rbcS genes in G. hirsutum. Subfamily 1 consisted of two genes showing >96 % sequence identity, while the three genes clustered as members of subfamily 2 showed 97–98 % sequence identity among themselves. Sequence comparison between the members of the two subfamilies showed sequence identity of around 87 %. Sequences 1 and 2 of subfamily 1 were named rbcS1a and rbcS1b, respectively, while the members of subfamily 2 were named rbcS2a, rbcS2b, and rbcS2c.
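Percent identity values of this kind are computed from aligned sequences. A minimal sketch is given below, assuming the two sequences have already been aligned to equal length and that columns containing a gap in either sequence are excluded; other conventions exist, and the sequences shown are toy examples rather than rbcS data.

```python
def percent_identity(a, b):
    """Percent identical positions between two pre-aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    same = sum(x == y for x, y in compared)
    return 100.0 * same / len(compared)


# Toy alignment: one mismatch in ten ungapped columns.
print(percent_identity("ATGGCTTCCT", "ATGGCTTCCA"))
```

Applying such a function to every pair of genes gives a pairwise identity matrix, from which groups with high within-group and lower between-group identity can be read off as subfamilies.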

Several groups (e.g., Rong et al. 2004; Shi et al. 2006; Udall et al. 2006; Taliercio and Boykin 2007; Pang et al. 2012) have deposited expressed sequence tags (ESTs) identified from different tissues of G. hirsutum in databases like GenBank at NCBI (http://ncbi.nlm.nih.gov/nucest) and “The DFCI Gossypium Gene Index (CGI)” (http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gimain.pl?gudb=cotton). While most of the early work was based on sequencing cDNA libraries, the recent paper by Pang et al. (2012) carried out the transcriptome analysis using 454 pyrosequencing technology. A total of 524,125 sequences were available in these repositories. In order to confirm whether there were only five genes of the rbcS family as identified in this study, we carried out a Blast search in the EST dataset mentioned above. The results showed that only five copies were present, as no new sequences showing similarity to the reported rbcS gene sequences were observed. In a recent study on the evolution of allopolyploids using cotton Rubisco, Gong et al. (2012) also suggested the presence of five copies of the rbcS gene.

The copy number of the genes in cotton was further confirmed by DNA blot analysis. The hybridization profile confirmed that the number of rbcS genes in all probability does not exceed five (data not shown).

The Structure of Cotton rbcS Genes

Analysis of cDNA clones also showed the presence of five different transcripts corresponding to the five members of the rbcS genes. A comparison of the coding regions of the cDNA and genomic DNA of the five rbcS genes indicated that the genes had two introns and three exons (Fig. 1). Members of subfamily 1 contained an open reading frame (ORF) of 549 bp, encoding a putative polypeptide of 182 amino acids. Members of subfamily 2 consisted of an ORF of 540 bp, encoding a polypeptide of 179 amino acids. The difference in the sizes of the gene sequences was due to the presence of a deletion of 9 bp in the first exon of the members of subfamily 2, as compared to the members of subfamily 1. Coding sequences were more conserved among members of the same subfamily (>97 %) than between members of different subfamilies. This was also found to be true for the intronic regions.

All the rbcS genes encode a transit peptide region and a mature protein region. The transit peptide was 59 amino acids long in the members of subfamily 1 and 56 amino acids long in the members of subfamily 2, while the mature protein was 123 amino acids long in all the five members. The transit peptide was encoded by exon 1, while the rest of the protein region was encoded by exons 2 and 3.
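The reported lengths are mutually consistent, as a short check shows: an ORF that includes the stop codon encodes one residue fewer than its number of codons, and the 9-bp deletion corresponds to the three-residue difference in transit peptide length. The sketch below uses only the figures given in the text and assumes the stated ORF lengths include the stop codon; the function name is illustrative.

```python
def protein_length(orf_bp):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the stop codon encodes no residue


# (subfamily, ORF length in bp, transit peptide length in amino acids)
for name, orf_bp, transit_aa in [("subfamily 1", 549, 59),
                                 ("subfamily 2", 540, 56)]:
    total_aa = protein_length(orf_bp)
    mature_aa = total_aa - transit_aa
    print(name, total_aa, mature_aa)
```

Both subfamilies give a mature protein of 123 residues, so the length difference between the subfamilies is confined to the transit peptide.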

The amino acid sequences of the proteins encoded by all the five genes were highly conserved with sequence homology ranging from 95 to 97.8 %. All the five genes clustered together and followed the overall phylogenetic position of some of the dicot families, viz. Solanaceae, Malvaceae, Brassicaceae, and Fabaceae that have been studied extensively for rbcS-encoded proteins (Fig. S2).

Determination of Transcription Initiation Sites by RNA Ligase-Mediated Amplification of 5′ cDNA Ends

TIS of the five rbcS genes were identified by performing RLM-RACE reactions on the total RNA isolated from cotton leaf and boll tissues separately. RLM-RACE allows the mapping of the transcription start sites only from the capped 5′ ends of mRNAs and not from the degraded mRNAs.

A single band was obtained for each gene when the PCR product was checked on 1 % agarose gel. Cloning and sequencing of the PCR products from 184 clones obtained from three independent PCR reactions revealed multiple transcription start sites for each of the five members (Fig. 2). Each clone had the adaptor sequence preceding the rbcS gene sequence, thus representing authentic TIS. All the clones matched their respective genes.

In the case of rbcS1a and rbcS1b, three TIS clustered in the region 60 to 68 bp upstream of the ATG were identified in transcripts from leaf tissue. An additional TIS, 11 bp downstream of the main cluster, was also observed for the rbcS1b gene. In transcripts from the boll tissue, five TIS were observed for rbcS1a and four for rbcS1b. Similarly, multiple TIS were identified for rbcS2a, rbcS2b, and rbcS2c in both tissues, except for rbcS2b, for which only one TIS was observed in the boll tissue. TIS of subfamily 2 were found to be clustered closer to the ATG in comparison to members of subfamily 1 (Fig. 2). In the available literature, a single TIS has been reported in the rbcS gene family members in Arabidopsis (Krebbers et al. 1988), tobacco (O'Neal et al. 1987), and pea (Fluhr et al. 1986). However, in the case of Petunia (Dean et al. 1987), multiple TIS were observed. As some of the earlier reports identified TIS mainly by carrying out S1 nuclease protection or by primer extension assays, only the major TIS might have been identified in the earlier papers.

Promoters of the rbcS Genes

Approximately 1.5 kb of the 5′ upstream regions of the identified genes were isolated by genome walking methods. The upstream regions of rbcS1a and rbcS1b showed 89.9 % sequence similarity. However, members of subfamily 2 showed a lower level of sequence similarity. While rbcS2b and rbcS2c showed 65 % similarity between themselves, they only had 11 % similarity with rbcS2a. Thus, the promoter region sequence of rbcS2a was found to be divergent from the other members of subfamily 2.

Fig. 1 Schematic representation of the region encoding the protein of the members of the two rbcS subfamilies. Exons and introns are shown by gray boxes and black lines, respectively. The length of each segment is shown above the region. Deletions are indicated by black triangles

The promoter regions of the five rbcS genes were analyzed for the presence of different cis-elements using PLACE (Higo et al. 1999), PlantCARE (Lescot et al. 2002), and Softberry (www.softberry.com) databases. A number of putative regulatory motifs corresponding to the known cis-elements of rbcS genes were found. The relative positions of the identified cis-elements in the different promoters are shown in Fig. 3. The placement of cis-elements in relation to each other has been shown to be important for their function. Many light-responsive motifs such as I box, G box, GT1 box, rbcS box II, rbcS box III, etc. were observed at approximately the same positions in the promoter regions of genes rbcS1a and rbcS1b. Several common blocks of cis-elements have been proposed, based on the analysis of promoters of different rbcS genes (Arguello-Astorga and Herrera-Estrella 1996). We observed the presence of two such blocks (CMA 4 and CMA 5) at around the −250-bp region in both rbcS1a and rbcS1b promoters. The CMA 5 is a combination of I-G-I boxes. The box was found to be in close proximity to some identified functional motifs such as the “S box” (Martinez-Hernandez et al. 2002) and the 1bAM5 (Lopez-Ochoa et al. 2007) motif. The cis-elements were found to be abundant up to the −800-bp region in promoters of subfamily 1, after which the relative abundance of the motifs decreased (Fig. 3). Several matrix-associated regions were observed around the −1,500-bp region of the promoters of rbcS1a and rbcS1b.
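The kind of motif search performed by cis-element databases can be illustrated with a simple pattern scan. The sketch below is not the PLACE or PlantCARE procedure: it searches only the forward strand, uses two commonly cited core sequences (CACGTG for the G box and GATAA for the I box core) as assumed patterns, and runs on an invented toy promoter. Positions are reported as negative offsets counted back from the 3′ end of the supplied sequence.

```python
import re

# Assumed core sequences, used here for illustration only.
MOTIFS = {"G box": "CACGTG", "I box core": "GATAA"}


def scan_promoter(promoter, motifs=MOTIFS):
    """Return (motif name, position) hits on the forward strand.

    Positions are negative offsets from the 3' end of the promoter,
    so a hit starting at the first base of a 22-bp sequence is -22.
    """
    sequence = promoter.upper()
    hits = []
    for name, pattern in motifs.items():
        # The lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?={pattern})", sequence):
            hits.append((name, match.start() - len(sequence)))
    return sorted(hits, key=lambda hit: hit[1])


# Toy promoter, invented for illustration, with an I-G-I arrangement.
print(scan_promoter("TTGATAAGCCACGTGGATAATC"))
```

Comparing the ordered hit lists of two promoters is one simple way to see whether motifs occupy similar relative positions, as described for rbcS1a and rbcS1b.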

The promoter regions of the rbcS2b and rbcS2c genes had very divergent organizations compared to the promoters of subfamily 1. These promoters had fewer known light-responsive motifs in the proximal region of the promoter (−360-bp region). Only the GT1, I box, rbcS box II, and tandem GATA repeats were found in this region. These boxes confer light responsiveness and transcriptional strength on the promoter (Castresana et al. 1988). The two promoters show 94.3 % identity in the −600-bp region, after which they have diverged considerably. The rbcS2b promoter has a deletion of 220 bp in this region in comparison to the promoter of the rbcS2c gene.

The organization of the rbcS2a promoter has diverged from the promoter organization of the other two members of subfamily 2 (rbcS2b and rbcS2c) and interestingly shows some features of the promoters of subfamily 1 (Fig. 3). We also observed the presence of one CMA 5-like block pattern (I-G-I) in the upstream promoter region of rbcS2a, which is not present in other members of subfamily 2. The positions of other cis-elements were also found to be different in the promoter region of rbcS2a as compared to that of the other members of the gene family.

Expression of the Members of rbcS Gene Family

Relative expression of the rbcS genes was checked in leaves from 33-, 45-, 60-, and 109-day-old plants as well as the upper green rind portion of boll tissues at two different developmental stages (5 and 10 days post anthesis, DPA) by semi-quantitative PCR analysis (Fig. 4). All the five members of the rbcS gene family were found to be active in the analyzed parts of the host plant at different developmental stages. Transcript levels of all the rbcS genes were found to be higher in the young leaf samples and gradually decreased as the leaves matured. Transcript levels of genes rbcS1a, rbcS1b, and rbcS2b were found to be relatively higher than the levels of the genes rbcS2a and rbcS2c in leaf and boll tissues at different stages of development.

We also analyzed the EST datasets for G. hirsutum in different studies for the presence of the transcripts of the five members. It was observed that all five members were expressed in the leaf (Rong et al. 2004; Udall et al. 2006), meristematic regions, young fibers, roots, and stem (Taliercio and Boykin 2007) and in ovules harvested 5 to 10 DPA (Shi et al. 2006). In the data generated in the work by Pang et al. (2012) based on 10-DPA ovaries, we could identify transcripts for only one of the members, i.e., rbcS1a.

Fig. 2 Identification of the transcription initiation sites of the rbcS genes. TIS were identified by 5′ RLM-RACE experiments. The location of the TIS as identified in transcripts from leaves and boll tissue is indicated by arrows above and below the sequence, respectively. Numbers in parentheses correspond to the number of clones observed for each of the corresponding TIS

Fig. 3 Placement of the known cis-elements in the promoter regions of the members of the rbcS gene family from G. hirsutum. Regions of rbcS2a promoter which are similar to rbcS1a and rbcS2c are shown with red and blue lines, respectively. A region of deletion in rbcS2b promoter as compared to the rbcS2c promoter is marked with a violet triangle. Regions which show significant similarity between the rbcS2b and 2c promoter regions are shown by green bars

Fig. 4 A comparative analysis of the expression of rbcS genes in different tissue samples of cotton by semi-quantitative RT-PCR analysis. Amplification obtained for each member of the rbcS family after 22, 23, and 24 cycles of PCR has been shown. The amplicons were sequenced to check their authenticity. The expression of ubiquitin gene was used as an internal control. The lanes marked M represent the size marker

Identification and Isolation of the Homologs of rbcS Genes from the Progenitor Diploid Species

To isolate rbcS genes from the diploid progenitor species, namely G. herbaceum (AA) and G. raimondii (DD), the genomic DNA of the two species was amplified with the degenerate primers used earlier with G. hirsutum. A single band was obtained from the reaction, which was cloned and sequenced. After analyzing the sequences of 41 clones, homologs of the rbcS gene family members were identified in the parental genomes. Homologs of rbcS1a, rbcS2a, and rbcS2c were identified in G. herbaceum, and homologs of rbcS1b and rbcS2b were present in G. raimondii. Thus, members of both the rbcS gene subfamilies were detected in the two diploid progenitor species. The obtained sequences for each of the five rbcS genes were submitted to GenBank under accession numbers JN608784, JN608785, JN608786, JN608787, and JN608789. Lengths and sequences of the isolated genes were found to be conserved in the diploid and tetraploid species, except in rbcS2b, where a deletion of 22 bp was found in intron 1 as compared to its homolog in the tetraploid species.

Gene-specific primers from the tetraploid species were used to isolate at least 500-bp promoter regions of the rbcS gene family members. The promoter regions of the rbcS genes were conserved among the homologous genes isolated from the three plant species. Interestingly, the promoter of the rbcS2a gene, which varies the most from the other rbcS genes, was found to be conserved in the diploid and the polyploid cotton species. Therefore, the divergence of the rbcS2a promoter from other rbcS promoters predates the origin of the tetraploid G. hirsutum.

Our study has described in detail the rbcS genes of G. hirsutum and its two diploid progenitors, G. herbaceum and G. raimondii. All the five rbcS genes in G. hirsutum were found to express in the green parts of the plant, albeit at varying levels of abundance. Promoters of three of the five rbcS genes characterized in the study can be used for driving high levels of transgene expression in the green tissues of the cotton plant. The described promoters seem to be of potential value in driving transgene expression in the green parts of the cotton plant including the young bolls, which are major targets of lepidopteran pests.

Acknowledgments We acknowledge the help of Dr. K. R. Kranthi, Director, Central Institute for Cotton Research, Nagpur, India, for providing the diploid cotton varieties for the work. This work was supported by a grant-in-aid from the Department of Biotechnology, Government of India, and the University of Delhi. KP also acknowledges the fellowship from the Council for Scientific and Industrial Research, India. The authors acknowledge the help of Dr. Anjana N. Dev in editing the manuscript.

References

Almeida ERPD, Gossele V, Muller CG, Dockx J, Reynaerts A,Botterman J, Krebbers E, Timko MP (1989) Transgenic expres-sion of two marker genes under the control of an Arabidopsis rbcSpromoter: sequences encoding the Rubisco transit peptide in-crease expression levels. Mol Gen Genet 218:78–86

Amarasinghe BHRR, Faivre-Nitschke E, Wu Y, Udall JA, Dennis ES,Constable G, Llewellyn DJ (2006) Genomic approaches to thediscovery of promoters for sustained expression in cotton(Gossypium hirsutum L.) under field conditions: expression anal-ysis in transgenic cotton and Arabidopsis of a Rubisco smallsubunit promoter identified using EST sequence analysis andcDNA microarrays. Plant Biotechnol 23:437–450

Anisimov A, Koivu K, Kanerva A, Kaijalainen S, Juntunen K,Kuvshinov V (2007) Cloning of new rubisco promoters fromBrassica rapa and determination of their activity in stablytransformed Brassica napus and Nicotiana tabacum plants. MolBreeding 19:241–253. doi:10.1007/s11032-006-9059-5

Arguello-Astorga GR, Herrera-Estrella LR (1996) Ancestral multipar-tite units in light-responsive plant promoters have structural fea-tures correlating with specific phototransduction pathways. PlantPhysiol 112:1151–1166. doi:10.1104/pp.112.3.1151

Brubaker CL, Bourland FM, Wendel JF (1999) The origin and domes-tication of cotton. In: Smith CW, Cotheren JT (eds) Cotton origin,history, technology and production. Wiley, New York, pp 3–31

Castresana C, Garcia-Luque I, Alonso E, Malik VS, Cashmore AR(1988) Both positive and negative regulatory elements mediateexpression of a photoregulated CAB gene from Nicotianaplumbaginifolia. EMBO J 7:1929–1936

Dean C, Elzen P, Tamaki S, Dunsmuir P, Bedbrook J (1985) Differen-tial expression of the eight genes of the petunia ribulosebisphosphate carboxylase small subunit multi-gene family.EMBO J 4:3055–3061

Dean C, Favreau M, Dunsmuir P, Bedbrook J (1987) Confirmation ofthe relative expression levels of the Petunia (Mitchell) rbcS genes.Nucleic Acids Res 15:4655–4668. doi:10.1093/nar/15.11.4645

Fluhr R, Moses P, Morelli G, Coruzzi G, Chua NH (1986) Expressiondynamics of the pea rbcS multigene family and organ distributionof the transcripts. EMBO J 5:2063–2071

Gong L, Salmon A, Yoo M-J, Grupp KK, Wang Z, Paterson AH, Wendel JF (2012) The cytonuclear dimension of allopolyploid evolution: an example from cotton using Rubisco. Mol Biol Evol 29(10):3023–3036. doi:10.1093/molbev/mss110

Higo K, Ugawa Y, Iwamoto M, Korenaga T (1999) Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Res 27:297–300. doi:10.1093/nar/27.1.297

Krebbers E, Seurinck J, Herdies L, Cashmore AR, Timko MP (1988) Four genes in two diverged subfamilies encode the ribulose-1,5-bisphosphate carboxylase small subunit polypeptides of Arabidopsis thaliana. Plant Mol Biol 11:745–759

Kumar S, Sharma P, Pental D (1998) A genetic approach to in vitro regeneration of non-regenerating cotton (Gossypium hirsutum L.) cultivars. Plant Cell Rep 18:59–63

Lescot M, Dehais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y, Rouze P, Rombauts S (2002) PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res 30:325–327. doi:10.1093/nar/30.1.325

Lopez-Ochoa L, Acevedo-Hernandez G, Martinez-Hernandez A, Arguello-Astorga G, Herrera-Estrella L (2007) Structural relationships between diverse cis-acting elements are critical for the functional properties of a rbcS minimal light regulatory unit. J Exp Bot 58:4397–4406. doi:10.1093/jxb/erm307

Martinez-Hernandez A, Lopez-Ochoa L, Arguello-Astorga G, Herrera-Estrella L (2002) Functional properties and regulatory complexity of a minimal RBCS light-responsive unit activated by phytochrome, cryptochrome, and plastid signals. Plant Physiol 128:1223–1233. doi:10.1104/pp.010678

Meier I, Callan KL, Fleming AJ, Gruissem W (1995) Organ-specific differential regulation of a promoter subfamily for the ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit genes in tomato. Plant Physiol 107:1105–1118. doi:10.1104/pp.107.4.1105

O'Neal JK, Pokalsky AR, Kiehne KL, Shewmaker CK (1987) Isolation of tobacco SSU genes: characterization of a transcriptionally active pseudogene. Nucleic Acids Res 15:8661–8677. doi:10.1093/nar/15.2.8661

Pang M, Percy RG, Stewart JMD, Hughs E, Zhang J (2012) Comparative transcriptome analysis of Pima and Acala cotton during boll development using 454 pyrosequencing technology. Mol Breeding 30:1143–1153. doi:10.1007/s11032-012-9702-2

Rawat P, Singh AK, Ray K, Chaudhary B, Kumar S, Gautam T, Kanoria S, Kaur G, Kumar P, Pental D, Burma PK (2011) Detrimental effect of expression of Bt endotoxin Cry1Ac on in vitro regeneration, in vivo growth and development of tobacco and cotton transgenics. J Biosci 36:363–376

Rong J, Abbey C, Bowers JE et al (2004) A 3347-locus genetic recombination map of sequence-tagged sites reveals features of genome organization, transmission and evolution of cotton (Gossypium). Genetics 166:389–417. doi:10.1534/genetics.166.1.389

Sagliocco F, Kapazoglou A, Dure L 3rd (1991) Sequence of an rbcS gene from cotton. Plant Mol Biol 17:1275–1276

Sasanuma T (2001) Characterization of the rbcS multigene family in wheat: subfamily classification, determination of chromosomal location and evolutionary analysis. Mol Genet Genomics 265:161–171. doi:10.1007/s04380000404

Sasanuma T, Miyashita NT (1998) Subfamily divergence in the multigene family of ribulose-1,5-bisphosphate carboxylase/oxygenase (rbcS) in Triticeae and its relatives. Genes Genet Syst 73:297–309

Shi Y-H, Zhu S-W, Mao X-Z et al (2006) Transcriptome profiling, molecular biological, and physiological studies reveal a major role for ethylene in cotton fiber cell elongation. Plant Cell 18:651–664. doi:10.1105/tpc.105.040303

Song P, Heinen JL, Burns TH, Allen RD (2000) Expression of two tissue-specific promoters in transgenic cotton plants. J Cott Sci 4:217–223

Sugita M, Manzara T, Pichersky E, Cashmore A, Gruissem W (1987) Genomic organization, sequence analysis and expression of all five genes encoding the small subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase from tomato. Mol Gen Genet 209:247–256

Taliercio EW, Boykin (2007) Analysis of gene expression in cotton fibre initials. BMC Plant Biol 7:22. doi:10.1186/1471-2229-7-22

Tumer NE, Clark WG, Tabor GJ, Hironaka CM, Fraley RT, Shah DM (1986) The genes encoding the small subunit of ribulose-1,5-bisphosphate carboxylase are expressed differentially in petunia leaves. Nucleic Acids Res 14:3325–3342. doi:10.1093/nar/14.8.3325

Udall JA, Swanson JM, Haller K et al (2006) A global assembly of cotton ESTs. Genome Res 16:441–450. doi:10.1101/gr.4602906

Wanner LA, Gruissem W (1991) Expression dynamics of the tomato rbcS gene family during development. Plant Cell 3:1289–1303. doi:10.1105/tpc.3.12.1289
