
Supplemental Table S3. Orthologous genes between Medicago and rice showing AS in both species



| Gene name | L primer | R primer |
|---|---|---|
| Branch point protein | GATCATCTCTCCGGGTTCAA | GCTCACTTTGCCAAATCCAT |
| bZIP transcription factor | CAAGCCCCTTTTAGGGATCT | GGAGATCATCAAGCCACGAT |
| Auxin response factor | CCCTTGGTTGTGTTCAAAGC | CTGCCATGCTCCATACCATA |
| Auxin-repressed protein, Drm | CAAACTCCGGAAACACCCTA | TTACCGGAAAACGGAGACAC |
| embryonic protein | ACCAAGGCATGAGTGTCTCC | AGCTCTGCTCCAATCACTCC |
| Extensin | TCCACCTCCATCTCCATCTC | CGACCTCGTCTGAATCACAA |
| Glycine-rich RNA-binding protein | GTTGAATACCGGTGCTTCGT | GGCATCGTTCATGGATTTCT |
| Malate dehydrogenase | ATGCCTCCATGGTTTGAAAG | CAACAGCTAAGCTATGGATAACACA |
| Splicing factor | TTTGGCAAAATTCGTTGACA | ACTCTTAGGCCCCATCTGGT |
| MYB59 transcription factor | ACGAAAAGGACCATGGACAG | CCTCTTTTGAGACCAGGATGA |
| Nodulin | TCTCGGTGCATTTGAAAACA | CTTGTTTCTTGTTGCCAAGG |
| Oxysterol-binding | AAGGGCAACCTTAACGTCCT | AACGTTCCCCTCTGGTTTCT |
| Peroxidase1 | TTGACATGGCTGTCATGGTT | GCATGCAATGAAATGTTTGG |
| Peroxidase2 | CGATTGCATTTCCATGACTG | GATTGAAAAAGGGTGCAGGA |
| phospholipase C | TTGATCAGTTCTCACCTCCAGA | TTATCTCCTTTGCGGGAATG |
| PR protein | TCCAGCACAGGCAGTAAGTG | ACCACATGCCAAACAGACAA |
| R protein | GATGGCTCTTACCCAAGCTG | CGGGACTGTAATCCAACAGG |
| Leguminosins | GCGCGCTTTGTAACTGAAAT | TTGCTGCATATCTCGTCCAG |
| Stress-related kinase | CTAGACTCATGCGCCACAAA | TGCTATGTAGGCTGGGGTTC |
| Ubiquitin protease | TGGGCCAAATATTTTTCAAGA | CACAATCGGTTCATTTTCCA |
| atDrm | GGGATGATACTGTCGCTGGT | GGAGCTGAACTGCCTTGGTA |
| atGRP7 | GGTGCTTCGTTGGAGGTCTA | TGTCCGTTCATTCCCTCAAT |
| atGRP8 | AAAGGACGTTCTCACAGTTCG | CTCGTTCACGGTGATGACAC |
| mtGRPb | TTCGATTCGATGGGTTCTTC | CTCCTCAGCGAATGTCACAA |
| mtGRPc | ATGGCTTCTGCAGATGTTGA | TGACCGTTCATCCCTTCAAT |

Supplemental Table S1. Sequences of PCR primers used in this study. To detect AS isoforms, primers were designed for twenty Medicago AS candidate genes. Additional primers were made for two Medicago GRP homologs and for the Arabidopsis Drm and GRP genes (grey region in the original table; the last five rows here).


Supplemental Table S2. Orthologous genes between Medicago and Arabidopsis showing AS in both species. The four basic AS types are indicated as AD (alternative donor), AA (alternative acceptor), ES (exon skipping), and RI (intron retention). Exon/intron structure was compared based on the number and size of exons and introns. Genes were scored as having conserved AS when they showed the same AS type in a comparable exon or intron.

| EGO ID | Medicago TC | Medicago AS type | Arabidopsis TC | Arabidopsis AS type | Exon/Intron | Conserved AS |
|---|---|---|---|---|---|---|
| 894524 | TC94246 | RI | TC251367 | Other | Similar | N |
| 895321 | TC107463 | RI | TC251597 | RI | Different | N |
| 895142 | TC94406 | RI | TC251668 | RI | Similar | N |
| 894090 | TC93939 | AD | TC251828 | AD | Similar | Y |
| 896371 | TC94400 | ES | TC251883 | AA | Similar | N |
| 894974 | TC103951 | Other | TC251930 | ES | Different | N |
| 900081 | TC95704 | ES | TC252381 | Other | Different | N |
| 901702 | TC107811 | AD | TC251464 | AA | Similar | N |
| 902625 | TC103450 | Other | TC252427 | RI | Different | N |
| 902626 | TC105045 | ES | TC252457 | AA, AD | Different | N |
| 894199 | TC107238 | Other | TC252801 | AA, AD | Different | N |
| 915297 | TC96361 | RI | TC253222 | AD | Similar | N |
| 915321 | TC94153 | RI | TC253860 | RI | Similar | Y |
| 896415 | TC94889 | Other | TC254095 | ES | Similar | N |
| 899485 | TC96374 | Other | TC254097 | RI | Similar | N |
| 897943 | TC110853 | ES | TC254268 | AD | Similar | N |
| 898375 | TC105891 | RI | TC254715 | RI | Similar | N |
| 896913 | TC95796 | Other | TC257117 | AA | Similar | N |
| 894555 | TC94062 | ES | TC261326 | AD | Similar | N |
| 897992 | TC94773 | AD | TC261808 | AD | Similar | N |
| 923867 | TC103376 | Other | TC262105 | RI | Different | N |
| 919078 | TC94653 | RI | TC263047 | RI | Similar | Y |
| 895886 | TC102978 | Other | TC263103 | RI | Different | N |
| 906752 | TC108996 | Other | TC263526 | ES | Different | N |
| 901890 | TC110977 | AA | TC263880 | Other | Different | N |
| 896973 | TC101061 | ES | TC264245 | AA, RI | Similar | N |
| 901901 | TC111448 | Other | TC264424 | RI, AA, AD | Different | N |
| 904007 | TC98940 | Other | TC265104 | Other | Similar | N |
| 896490 | TC96927 | Other | TC271064 | AA | Similar | N |
| 897700 | TC107290 | RI | TC271372 | ES, AA, AD | Similar | N |


Supplemental Table S3. Orthologous genes between Medicago and rice showing AS in both species.

| EGO ID | Medicago TC | Medicago AS type | Rice TC | Rice AS type | Exon/Intron | Conserved AS |
|---|---|---|---|---|---|---|
| 894846 | TC100632 | Other | TC248905 | RI | Similar | N |
| 894334 | TC107377 | RI | TC261167 | ES, AA | Similar | N |
| 894346 | TC106667 | Other | TC247924 | AA, AD | Similar | N |
| 894400 | TC107075 | Other | TC249764 | ES, AA, AD | Similar | N |
| 894972 | TC107574 | RI | TC251418 | ES, AA, AD | Similar | N |
| 895004 | TC106179 | Other | TC264205 | RI | Different | N |
| 895754 | TC93935 | Other | TC279637 | RI | Different | N |
| 895841 | TC100891 | RI | TC262664 | RI | Similar | N |
| 897177 | TC107310 | AD | TC249956 | RI | Different | N |
| 897994 | TC103338 | Other | TC275793 | Other | Similar | N |
| 899808 | TC99224 | RI | TC269359 | AA, RI | Different | N |
| 900783 | TC112257 | Other | TC266015 | Other | Different | N |
| 900871 | TC98778 | Other | TC266125 | AA | Similar | N |
| 902131 | TC96526 | Other | TC250035 | ES, RI | Different | N |
| 919078 | TC94653 | RI | TC276864 | RI | Similar | Y |
| 963200 | TC111508 | RI | TC269359 | RI, AA | Similar | N |


Supplemental Table S4. Orthologous genes between rice and Arabidopsis with AS in both species.

| EGO ID | Rice TC | Rice AS type | Arabidopsis TC | Arabidopsis AS type | Exon/Intron | Conserved AS |
|---|---|---|---|---|---|---|
| 894019 | TC279241 | ES, AA | TC252628 | Other | Different | N |
| 894196 | TC249477 | AD | TC260991 | ES, AA, AD | Similar | N |
| 894367 | TC249232 | ES | TC252171 | AA | Different | N |
| 894387 | TC252572 | Other | TC253248 | AD | Different | N |
| 894418 | TC258983 | AA | TC254165 | ES | Similar | N |
| 894546 | TC252869 | RI | TC257968 | AA | Different | N |
| 894557 | TC248638 | AA, AD | TC270599 | Other | Different | N |
| 894662 | TC253347 | AA | TC264106 | AD | Similar | N |
| 894875 | TC249346 | AA, RI | TC261646 | Other | Different | N |
| 894924 | TC252707 | RI | TC274400 | ES, AA | Different | N |
| 895009 | TC248418 | AA, AD | TC270609 | Other | Different | N |
| 895069 | TC262955 | RI | TC253233 | RI | Similar | N |
| 895254 | TC248647 | RI | TC255025 | RI | Similar | Y |
| 895278 | TC262569 | RI | TC270647 | Other | Different | N |
| 895469 | TC247935 | RI | TC263890 | AA, AD | Similar | N |
| 895533 | TC260621 | ES | TC256396 | ES | Different | N |
| 895576 | TC249047 | ES | TC261305 | RI | Similar | N |
| 895599 | TC250139 | AD, ES | TC264543 | ES | Different | N |
| 895753 | TC256600 | Other | TC266364 | ES | Different | N |
| 895755 | TC276580 | RI | TC270816 | Other | Similar | N |
| 895827 | TC262437 | Other | TC252072 | AD | Different | N |
| 895917 | TC248179 | AD | TC271574 | RI | Different | N |
| 896105 | TC276830 | RI | TC273965 | RI | Different | N |
| 896142 | TC279866 | ES | TC266192 | AA | Different | N |
| 896189 | TC265503 | RI | TC252578 | Other | Different | N |
| 896427 | TC265653 | ES | TC255275 | AA | Different | N |
| 896477 | TC266786 | Other | TC264306 | AA, ES | Different | N |
| 896637 | TC275172 | AD | TC265319 | AA | Different | N |
| 896680 | TC251855 | RI | TC261346 | ES | Different | N |
| 896701 | TC264628 | RI | TC262704 | AD | Similar | N |
| 896850 | TC247941 | RI | TC251505 | RI | Different | N |
| 897626 | TC248800 | RI | TC262040 | RI | Different | N |
| 897682 | TC262402 | RI | TC269897 | RI | Similar | N |
| 897695 | TC262890 | AD | TC271256 | ES | Different | N |
| 897719 | TC249699 | RI | TC274114 | AA, AD, ES | Similar | N |
| 897959 | TC250370 | RI | TC256567 | ES | Similar | N |
| 898472 | TC252696 | Other | TC276415 | AA | Similar | N |
| 899417 | TC256678 | Other | TC251551 | AA | Similar | N |
| 900340 | TC259380 | RI | TC271453 | AA, AD | Similar | N |
| 901033 | TC278938 | RI | TC263879 | Other | Different | N |
| 901196 | TC264936 | AA | TC273818 | AD | Different | N |
| 901816 | TC268910 | ES | TC258030 | RI | Similar | N |
| 901855 | TC263013 | AA | TC262479 | Other | Similar | N |
| 901876 | TC251315 | ES | TC263079 | RI | Different | N |
| 901901 | TC268961 | RI | TC264424 | AA, AD | Different | N |


Supplemental Table S4. Continued

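| EGO ID | Rice TC | Rice AS type | Arabidopsis TC | Arabidopsis AS type | Exon/Intron | Conserved AS |
|---|---|---|---|---|---|---|
| 902592 | TC249676 | RI | TC263382 | AA | Different | N |
| 902797 | TC269809 | RI | TC261869 | AD | Different | N |
| 903010 | TC273363 | ES | TC273137 | RI | Different | N |
| 903747 | TC250572 | ES | TC252465 | AA, AD | Different | N |
| 904110 | TC277392 | Other | TC272978 | AA | Similar | N |
| 904141 | TC262876 | Other | TC273896 | ES, AA | Different | N |
| 904144 | TC280576 | Other | TC273998 | Other | Similar | N |
| 905285 | TC272901 | ES | TC264030 | RI | Different | N |
| 906573 | TC266347 | Other | TC254640 | ES, AD | Different | N |
| 906788 | TC253671 | AD | TC264805 | Other | Similar | N |
| 906811 | TC253923 | RI | TC265559 | AA | Similar | N |
| 906913 | TC278477 | Other | TC272001 | AA | Similar | N |
| 908281 | TC264430 | RI | TC254281 | ES | Different | N |
| 908646 | TC274740 | AA | TC272453 | RI | Similar | N |
| 908662 | TC276487 | Other | TC272854 | AA | Similar | N |
| 910805 | TC283051 | AA | TC274916 | AA, AD, RI | Similar | N |
| 919078 | TC276864 | RI | TC263047 | RI | Similar | Y |
| 924076 | TC260302 | RI | TC266894 | Other | Different | N |
| 924166 | TC248037 | ES, RI | TC270742 | ES, AA | Similar | N |
| 951276 | TC260882 | AA, AD | TC275877 | Other | Similar | N |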


Supplemental Table S5. Orthologous genes among Medicago, Arabidopsis and rice showing AS in all species.

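| EGO ID | Medicago TC | Medicago AS type | Arabidopsis TC | Arabidopsis AS type | Rice TC | Rice AS type | Exon/Intron | Conserved AS |
|---|---|---|---|---|---|---|---|---|
| 893958 | TC102929 | ES | TC255114 | RI | TC252784 | ES | Different | N |
| 894400 | TC107075 | Other | TC271789 | Other | TC249764 | AA, AD, ES | Similar | N |
| 893958 | TC109965 | RI | TC255114 | RI | TC252784 | Other | Similar | N |
| 901901 | TC111448 | Other | TC264424 | AA, AD | TC268961 | RI | Similar | N |
| 894090 | TC93939 | AD, RI | TC251828 | AD, RI | TC261824 | AD, RI | Similar | Y |
| 919078 | TC94652 | AD, RI | TC263047 | AD, RI | TC276864 | AD, RI | Similar | Y |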
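One simple way to cross-check the pairwise tables against the three-species table is a set intersection on EGO IDs. The sketch below uses only the EGO IDs flagged "Y" in Supplemental Tables S2, S3 and S4; the variable and function names are illustrative and are not taken from the study.

```python
# EGO IDs flagged "Y" (conserved AS) in each pairwise table, transcribed from
# Supplemental Tables S2, S3 and S4. Variable names are illustrative only.
conserved_medicago_arabidopsis = {894090, 915321, 919078}  # Table S2
conserved_medicago_rice = {919078}                          # Table S3
conserved_rice_arabidopsis = {895254, 919078}               # Table S4


def conserved_in_all(*pairwise_sets):
    """Return the EGO IDs present in every pairwise set."""
    if not pairwise_sets:
        return set()
    return set.intersection(*pairwise_sets)


shared = conserved_in_all(
    conserved_medicago_arabidopsis,
    conserved_medicago_rice,
    conserved_rice_arabidopsis,
)
print(sorted(shared))  # [919078]
```

Table S5 also flags EGO 894090, which is absent from Tables S3 and S4, so the three-species table was evidently not derived by this intersection alone.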


Supplemental Table S6. Correlation matrix of AS incidence between species and across GO categories.

| Species | Arabidopsis | Medicago | Rice |
|---|---|---|---|
| Arabidopsis | 1.00 | 0.57 | 0.00 |
| Medicago | 0.57 | 1.00 | -0.01 |
| Rice | 0.00 | -0.01 | 1.00 |

Values are Pearson's correlation coefficients.
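For readers who want to reproduce this kind of matrix, the following is a minimal sketch of a pairwise Pearson correlation across GO categories. The per-category AS bias values are invented for illustration (the study's per-category values are not given in this table), and the function name `pearson` is a placeholder.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical AS bias values for five GO categories (NOT data from the study).
as_bias_by_species = {
    "Arabidopsis": [1.2, 0.8, 1.0, 1.5, 0.6],
    "Medicago": [1.1, 0.9, 1.3, 1.4, 0.7],
    "Rice": [0.9, 1.1, 1.0, 0.8, 1.2],
}

species = list(as_bias_by_species)
for a in species:
    row = [round(pearson(as_bias_by_species[a], as_bias_by_species[b]), 2)
           for b in species]
    print(a, row)
```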


Supplemental Figure S1. Sequences of exon/intron junctions at donor (A) and acceptor (B) sites. Nucleotide sequences of exon/intron junctions were extracted based on the Spidey alignment. Pictograms were obtained from http://genes.mit.edu/pictogram.html using the extracted data.

[Figure: panels A and B each contain pictograms for Arabidopsis, Medicago, Poplar, and Rice.]


Supplemental Figure S2. Sequence alignment of plant U1 snRNAs. The sequence data were collected from public databases, either from publications or by sequence homology search. Data from Arabidopsis, tomato, Medicago, pea, soybean, rice, and wheat are presented. Only the 5' half of the alignment is shown, and the nine nucleotides conserved at the 5' ends are indicated with a red line.


Supplemental Figure S3. Distribution of UA content in plant introns (A) and exons (B). The fraction of introns or exons, in percent, was plotted for each 10% interval of UA content.

[Figure: panel A plots % of introns and panel B plots % of exons against % of UA (bins <30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90) for Arabidopsis, Medicago, Poplar, and Rice.]
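The binning described in the Figure S3 caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the example sequences are invented, T is counted as U so that DNA input also works, and the handling of bin edges (for instance, whether exactly 30% belongs to "<30") is a guess, since the axis labels do not say.

```python
def ua_percent(sequence):
    """Percentage of U/A residues; T is counted as U so DNA input also works."""
    seq = sequence.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "UA") / len(seq)


# Bin labels follow the x-axis of the figure; edge handling is an assumption.
BINS = [(30, "<30"), (40, "31-40"), (50, "41-50"), (60, "51-60"),
        (70, "61-70"), (80, "71-80"), (90, "81-90")]


def ua_bin(percent):
    """Return the label of the UA-content bin that contains `percent`."""
    for upper, label in BINS:
        if percent <= upper:
            return label
    return ">90"  # not shown on the figure axis


def ua_distribution(sequences):
    """Percentage of sequences falling in each UA-content bin."""
    if not sequences:
        raise ValueError("no sequences given")
    counts = {label: 0 for _, label in BINS}
    counts[">90"] = 0
    for seq in sequences:
        counts[ua_bin(ua_percent(seq))] += 1
    return {label: 100.0 * n / len(sequences) for label, n in counts.items()}


# Hypothetical intron sequences (NOT data from the study).
introns = ["GUAAGUUUUAUAUUUCAG", "GUACGCGCGCCCAG", "GUAUAUAUUUUAAUUAG"]
for label, pct in ua_distribution(introns).items():
    print(f"{label:>6}: {pct:.1f}%")
```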


Supplemental Figure S4. AS of an mtMYB1 gene. (A) ORF structures of AS products. Filled boxes represent peptide-coding regions and open boxes denote spliced exon regions without an ORF. Lines show introns removed during splicing. Locations of PCR primers are indicated by arrows. (B) Alignment of polypeptide sequences of the mtMYB1 AS products. (C) RT-PCR of mtMYB1 from Medicago leaf and root tissues.

[Figure: panel A shows three transcripts, each with ATG and TAA marked: mtMYB_spliced, TC101530 (AD), and TC101529 (RI). Panel C shows L and R lanes for Mt.]


Supplemental Figure S5. Comparison of Medicago polypeptide sequences deduced from AS isoforms of Drm (A), PR protein (B), and peroxidase (C) genes.


Supplemental Figure S6. Duplication of Medicago GRP genes. Three copies of the mtGRP gene were found as a tandem repeat in a BAC clone (mth2-10P20, GenBank accession no. AC134242). The gene prediction diagram was obtained from http://dna8.genome.ou.edu/cgi-bin/gbrowse/medicago_new. The mtGRP genes are indicated as red ovals with the corresponding TC IDs.

[Figure labels: mtGRP1, TC93939; mtGRP2, TC96436; mtGRP3, TC100232.]


Supplemental Figure S7. A. Sequence similarity of orthologous GRP genes from Arabidopsis and Medicago. Polypeptide sequences were inferred from cDNA sequences and aligned with the ClustalW algorithm. B. Conserved AS patterns of GRP gene homologs in Arabidopsis and Medicago. Total RNA was extracted from leaf (L) and root (R) tissues of Arabidopsis and Medicago, and RT-PCR was performed with specific primer sets for each GRP gene. Three different amplicons were detected for each GRP gene by agarose gel electrophoresis. Individual bands were extracted from the agarose gel and verified by DNA sequencing. The phylogenetic tree was drawn with the neighbor-joining method using the PAUP program, based on the deduced amino acid sequences.

[Figure labels: atGRP7, atGRP8, mtGRP2, mtGRP1, mtGRP3; each gel image has leaf (L) and root (R) lanes.]


Supplemental Figure S8. Sequence homology among MYB orthologous genes of Medicago (mt), Arabidopsis (at), and rice (os). Peptide sequences were inferred from fully spliced transcript isoforms of each gene. The matrix of sequence identities was produced by means of the ClustalW algorithm.


Supplemental Figure S9. AS frequency in different GO categories. To assess whether AS frequency was correlated with particular gene functions, all transcripts (TCs) with deduced intron-exon structures were grouped according to the Gene Ontology (GO) classification system, using the TC-to-GO mapping data available through the TIGR Gene Index database. Only GO categories containing at least 1% of the annotated TCs in each species were considered in this analysis. For example, in Medicago, 4,838 transcripts could be associated with the 10 GO categories, including 266 transcripts represented by multiple AS isoforms. The numbers of TCs assignable to these same 10 GO categories were 13,585 (total) and 2,797 (AS) for Arabidopsis, and 7,700 (total) and 911 (AS) for rice.

Although the number of TCs varied significantly between GO categories, the relative distribution of TCs across the GO categories was similar between species (Supplemental Figure S9-1), suggesting that differences in AS frequency between species were unlikely to be influenced by TC distribution. Similarly, to determine whether differences in gene expression levels between species and/or GO categories might be correlated with differences in AS frequency, we plotted AS incidence as a function of the number of ESTs in each transcript (i.e., the relative in silico gene expression value). As shown in Supplemental Figure S9-2, AS incidence was similar across the range of inferred gene expression levels, with similar results in all 3 species. Thus, it seems unlikely that differences in gene expression would contribute significantly to differences in AS frequency, especially when AS frequency is expressed as an aggregate value for groups of genes (i.e., GO categories).

The relative occurrence of AS in a particular GO category is referred to as AS bias, and represents the ratio of the percentage of AS TCs in a given GO category to the percentage of total TCs in that same category. Thus, a category containing 5% of all TCs, and in which 7.5% of TCs are AS isoforms, would be overrepresented for AS and have an AS bias of 1.5 (i.e., 7.5%/5%). Values of 1 represent no bias, while values of <1 indicate underrepresentation of AS.
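The AS bias ratio can be written out as a short calculation. The sketch below assumes that "percent AS TCs in a given GO category" means the category's share of all AS TCs, which is one reading of the definition; the function name and the counts are invented so that the shares come out to the 7.5% and 5% used in the worked example.

```python
def as_bias(as_in_category, as_total, tcs_in_category, tcs_total):
    """Ratio of a category's share of AS TCs to its share of all TCs."""
    if min(as_total, tcs_in_category, tcs_total) <= 0:
        raise ValueError("totals and category size must be positive")
    pct_as = 100.0 * as_in_category / as_total      # share of AS TCs, in %
    pct_tcs = 100.0 * tcs_in_category / tcs_total   # share of all TCs, in %
    return pct_as / pct_tcs


# Made-up counts chosen to give 7.5% of AS TCs and 5% of all TCs.
bias = as_bias(as_in_category=15, as_total=200,
               tcs_in_category=50, tcs_total=1000)
print(f"AS bias = {bias:.2f}")  # AS bias = 1.50
```

A value above 1, as here, marks a category that is overrepresented for AS; swapping the two shares would give a value below 1.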


Supplemental Figure S9-1. TC frequency in GO categories. For each species, the number of TCs in each GO group was counted and plotted as a percentage of the total number of TCs.


[Figure: three panels (Rice, Arabidopsis, Medicago). Each plots # of TCs (axis 0 to 10000) and % of AS TC (axis 0.0 to 20.0) against # of ESTs (bins 2-10, 11-20, 21-30, 30-40, 41-50, 51-60, 61-70, 71-80, 81-90, >91). Values printed on the plots: 18,479; 28,950; 13,676.]

Supplemental Figure S9-2. Number of ESTs per TC and AS frequency. The number of TCs at each EST count is shown as a bar graph, and AS incidence is plotted as a percentage as a function of EST count.
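The tabulation behind a plot of this kind can be sketched as below: TCs are grouped by EST count, and the percentage with AS isoforms is computed per group. The input records are invented, the function names are placeholders, and the bin edges are an assumption based on the axis labels (the figure prints "30-40" and ">91", taken here to mean 31-40 and more than 90).

```python
# EST-count bins modelled on the figure's x-axis labels (edges are assumed).
EDGES = [10, 20, 30, 40, 50, 60, 70, 80, 90]
LABELS = ["2-10", "11-20", "21-30", "31-40", "41-50",
          "51-60", "61-70", "71-80", "81-90", ">90"]


def est_bin(est_count):
    """Map a TC's EST count to its bin label."""
    for edge, label in zip(EDGES, LABELS):
        if est_count <= edge:
            return label
    return LABELS[-1]


def as_incidence_by_bin(tcs):
    """tcs: iterable of (est_count, has_as_isoforms) pairs, one per TC.

    Returns {label: (number of TCs, % of TCs with AS or None if bin is empty)}.
    """
    totals = dict.fromkeys(LABELS, 0)
    with_as = dict.fromkeys(LABELS, 0)
    for est_count, has_as in tcs:
        label = est_bin(est_count)
        totals[label] += 1
        with_as[label] += 1 if has_as else 0
    return {
        label: (totals[label],
                100.0 * with_as[label] / totals[label] if totals[label] else None)
        for label in LABELS
    }


# Hypothetical TCs (NOT data from the study): (EST count, has AS isoforms).
example_tcs = [(3, False), (5, True), (8, False), (9, False),
               (15, True), (18, False), (95, True)]
for label, (n_tcs, pct_as) in as_incidence_by_bin(example_tcs).items():
    if n_tcs:
        print(f"{label:>6}: {n_tcs} TCs, {pct_as:.1f}% with AS")
```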


Supplemental Figure S10. We observed both differences in AS frequency between GO categories, and differences between species within individual GO categories. Correlation analysis was used to compare the AS bias differences observed between species. Principal component analysis (PCA) indicates that the first 2 components explain 85% of the total variation observed in the dataset. The Arabidopsis and Medicago datasets, but not rice, were highly correlated with the first principal component (PC1), while rice was correlated with PC2. These relationships are also reflected in the correlation matrix shown in Supplemental Table S6, where a significant positive correlation was evident between the Arabidopsis and Medicago datasets, but neither dicot was correlated with rice. Gene ontology categories that contributed most to the variability explained by PC1 were transferase activity, membrane, and cell communication. On the other hand, cell communication and hydrolase activity, followed by development and oxidoreductase activity, were large components of the variability explained by PC2. AS bias for gene ontology categories such as nucleic acid binding and biosynthetic process were consistent with the correlation analysis, revealing differences between dicot and monocot datasets, but had a small contribution to the overall dataset variance. By contrast, AS bias was well correlated for all three species in the GO kinase activity category.
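The share of variance attributed to the first two components can be roughly checked against the correlation matrix in Supplemental Table S6. The sketch below assumes that the PCA was run on standardized variables (equivalently, on the correlation matrix), which the source does not state. The eigenvalue routine is a generic Jacobi-rotation method written for this illustration, not the software used in the study.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a small symmetric matrix via Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Correlation matrix from Supplemental Table S6
# (order: Arabidopsis, Medicago, Rice).
corr = [
    [1.00, 0.57, 0.00],
    [0.57, 1.00, -0.01],
    [0.00, -0.01, 1.00],
]

eigenvalues = symmetric_eigenvalues(corr)
explained = [ev / sum(eigenvalues) for ev in eigenvalues]
print([round(x, 3) for x in explained])
print(round(sum(explained[:2]), 3))  # about 0.857 under these assumptions
```

Under these assumptions the first two components account for roughly 86% of the variance, close to the 85% quoted above; the small difference is within what the two-decimal rounding of the table entries would allow.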