RNA meets promoter (The story of Old, New, Borrowed and Blue)

Wolfgang Otto

Chair in Bioinformatics, University of Leipzig

Herbstseminar, October 2009



Introduction Method Results

Motivation

• background: ncRNA annotation within the worms (this presentation covers only Caenorhabditis; P. pacificus has slightly different promoters, while B. malayi, M. haplanaria and M. incognita acrita have no recognisable promoter elements)

• goal: find new members of the ncRNA families (and maybe genes of unknown families)


Idea

• find ncRNA genes by their promoters

1. analyse promoters of known worm ncRNA genes

2. search for similar motifs in worm genomes

3. check the DNA downstream of promoter-like regions


Promoter Analyses

• ncRNAs are transcribed by polymerase II and III

• extracted 100 nt upstream of U3 snoRNA, 7SK RNA, SL1 and SL2 RNA, SmY RNA, U1, U2, U4 and U5 snRNA (pol II) and 100 nt upstream of RNase MRP, RNase P, sbRNA and U6 snRNA (pol III)

• aligned the extracted sequences on the PSE-B box (present in every promoter sequence)

• sequences without promoter sequences were used to identify pseudogenes
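The upstream extraction step can be sketched as follows. This is a minimal illustration, not the script used for the talk: the function names, the 0-based half-open coordinate convention and the toy chromosome are assumptions. The point it shows is that "upstream" depends on the strand, so genes on the minus strand need the reverse complement of the region that follows them on the forward strand.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T, either case)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def upstream(chrom_seq: str, start: int, end: int, strand: str, n: int = 100) -> str:
    """Return up to n nt directly upstream of a gene, read 5'->3' on the gene's strand.

    start/end are assumed to be 0-based, half-open forward-strand coordinates.
    """
    if strand == "+":
        # Upstream of a plus-strand gene lies before its start coordinate.
        return chrom_seq[max(0, start - n):start]
    if strand == "-":
        # Upstream of a minus-strand gene lies after its end coordinate
        # on the forward strand and has to be reverse-complemented.
        return reverse_complement(chrom_seq[end:end + n])
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


# Toy example (hypothetical sequence, n shortened to 4 for readability).
toy_chrom = "ACGTTTGGCCAA"
plus_promoter = upstream(toy_chrom, 4, 8, "+", n=4)
minus_promoter = upstream(toy_chrom, 4, 8, "-", n=4)
```

For the analysis described above, `n` would stay at its default of 100 and the coordinates would come from the ncRNA annotation.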


Global Promoter Search

• used the alignment to create a fragrep pattern for each kind of promoter sequence

• remarkable: two kinds of PSE-B boxes in pol II promoters

pol II (a): U3 snoRNA, 7SK RNA, SL1 RNA

pol II (b): SL2 RNA, SmY RNA, U1, U2, U4 and U5 snRNA


Global Promoter Search

• pattern search with fragrep and relatively variable scores ⇒ high number of candidates

• reduce the number of false positive hits

  • use a mismatch-based matrix similarity score (mmS)

  • shuffle the genomes and repeat the fragrep search; these hits are 100% false positives

  • define the cutoff mmS at which only 1% of the false positive hits remain

  • apply this cutoff mmS to the original hits
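The cutoff selection can be sketched as below. The sketch assumes that every hit already carries a numeric score where higher means a better match; how mmS itself is computed is not shown here. The helper names, the mononucleotide shuffle and the strict "greater than" rule are illustrative assumptions, since the slides do not specify these details.

```python
import random


def shuffle_sequence(seq: str, seed: int = 0) -> str:
    """Mononucleotide shuffle: keeps base composition, destroys motifs (assumed method)."""
    rng = random.Random(seed)
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def cutoff_at_fp_fraction(decoy_scores, keep_fraction=0.01):
    """Return a score cutoff such that at most keep_fraction of the decoy hits
    (hits found in shuffled genomes) score strictly above it."""
    if not decoy_scores:
        raise ValueError("need at least one decoy score")
    ranked = sorted(decoy_scores, reverse=True)
    # Number of decoy hits that are allowed to survive the cutoff.
    allowed = min(int(len(ranked) * keep_fraction), len(ranked) - 1)
    return ranked[allowed]


def filter_hits(hits, cutoff):
    """Keep (hit_id, score) pairs whose score is strictly above the cutoff."""
    return [(hit_id, score) for hit_id, score in hits if score > cutoff]


# Toy example: 200 made-up decoy scores, so 1% corresponds to 2 surviving decoys.
decoys = [float(i) for i in range(200)]
cutoff = cutoff_at_fp_fraction(decoys, keep_fraction=0.01)
real_hits = [("hit_a", 198.5), ("hit_b", 10.0)]
kept = filter_hits(real_hits, cutoff)
```

Using a strict comparison means that ties at the cutoff are discarded, which keeps the surviving decoy fraction at or below the requested 1%.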


The Story of Old, New, Borrowed and Blue

• remaining hits are assumed to be real promoter sequences

1. adjust hits of putative ncRNAs used for the pattern generation [⇒ old and borrowed]

2. look for new members of known families [⇒ new]

  • generate a FASTA db consisting of all known ncRNAs

  • BLAST search for the downstream sequence of the hits (100 nt) in the FASTA db

3. check remaining hits for ncRNA genes of unknown families [⇒ blue]


The Blue Part

• extract 70nt downstream of each promoter hit

• create clusters of similar sequences with blastclust

• remove all clusters that contain only sequences from one species

• check remaining inter-species clusters for potential ncRNA genes (so far only with the UCSC genome browser)

• ⇒ 11 potential new ncRNA families
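The single-species filter can be sketched as follows. blastclust writes one cluster per line as space-separated sequence identifiers; the identifier scheme `species|hit`, the species prefixes and the function names are assumptions made for this illustration only.

```python
def parse_blastclust(text: str):
    """Parse blastclust output: one cluster per line, identifiers separated by spaces."""
    return [line.split() for line in text.splitlines() if line.strip()]


def species_of(seq_id: str) -> str:
    """Extract the species prefix from an identifier such as 'cel|hit42'
    (hypothetical naming scheme)."""
    return seq_id.split("|", 1)[0]


def inter_species_clusters(clusters, min_species: int = 2):
    """Keep only clusters whose members come from at least min_species species."""
    return [
        cluster
        for cluster in clusters
        if len({species_of(seq_id) for seq_id in cluster}) >= min_species
    ]


# Toy blastclust output with made-up identifiers.
example_output = "cel|h1 cbr|h7 cre|h3\ncel|h2 cel|h9\ncbr|h4\n"
clusters = parse_blastclust(example_output)
conserved = inter_species_clusters(clusters)
```

Only the first toy cluster survives, because the other two contain sequences from a single species.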


Cluster 1 (c000124, Pol III)

• chrI:8,245,471-8,245,683 (-); MFE = −14.50 kcal/mol

• EST, GB: BJ118936.1 (unpublished oligo-capped cDNA library, L1 stage)

[Figure: UCSC Genome Browser view of the chrI locus (100-base scale bar) with the tracks Gap Locations, GC Percent in 5-Base Windows, WormBase Gene Annotations, RefSeq Genes, Non-C. elegans RefSeq Genes, C. elegans mRNAs from GenBank, C. elegans ESTs That Have Been Spliced, Multiz Alignment & Conservation (6 nematodes), the alignment nets for P. pacificus (Feb. 2007/priPac1), C. japonica (Mar. 2008/caeJap1), C. brenneri (Feb. 2008/caePb2), C. briggsae (Jan. 2007/cb3) and C. remanei (May 2007/caeRem3), and Repeating Elements by RepeatMasker]

[Figure: RNA secondary structure drawing of the candidate]


Cluster 2 (c000147, Pol III)

• chrII:5,599,660-5,599,872 (+); MFE = −21.00 kcal/mol

[Figure: UCSC Genome Browser view of the chrII locus (100-base scale bar) with the tracks Gap Locations, GC Percent in 5-Base Windows, WormBase Gene Annotations, RefSeq Genes, Non-C. elegans RefSeq Genes, C. elegans mRNAs from GenBank, C. elegans ESTs That Have Been Spliced, Multiz Alignment & Conservation (6 nematodes), the alignment nets for P. pacificus (Feb. 2007/priPac1), C. japonica (Mar. 2008/caeJap1), C. brenneri (Feb. 2008/caePb2), C. briggsae (Jan. 2007/cb3) and C. remanei (May 2007/caeRem3), and Repeating Elements by RepeatMasker]

[Figure: RNA secondary structure drawing of the candidate]


Cluster 3 (h0000010, Pol III)

• chrII:14,635,601-14,636,068 (-)

[Figure: UCSC Genome Browser view of the chrII locus (100-base scale bar) with the tracks WormBase Gene Annotations (showing C38C6.4 and sre-13), RefSeq Genes, Non-C. elegans RefSeq Genes, C. elegans mRNAs from GenBank, C. elegans ESTs That Have Been Spliced, Multiz Alignment & Conservation (6 nematodes), the alignment nets for P. pacificus (Feb. 2007/priPac1), C. japonica (Mar. 2008/caeJap1), C. brenneri (Feb. 2008/caePb2), C. briggsae (Jan. 2007/cb3) and C. remanei (May 2007/caeRem3), and Repeating Elements by RepeatMasker]