Gene models and proteomes for Saccharomyces cerevisiae (Sc), Schizosaccharomyces pombe (Sp), Arabidopsis thaliana (At), Oryza sativa (Os), Drosophila melanogaster (Dm), Anopheles gambiae (Ag), Caenorhabditis elegans (Ce), Mus musculus (Mm), Rattus norvegicus (Rn), and Homo sapiens (Hs) were downloaded from the NCBI website (ftp.ncbi.nlm.nih.gov).
Our definition of genes with unknown function
HMMPFAM search against several major signature databases: PFAM, TIGRFAM, SMART, and Superfamily.
• Match to one or more of the models in any one of the databases: “Known (PDF)”
• No match to any of the models in any database: “Unknown (POF)”
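The known/unknown rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the authors' pipeline: it assumes the HMMPFAM results have already been reduced to one set of matching protein IDs per database, and the names `classify_proteins`, `unknown_fraction`, and `hits_by_db` are hypothetical.

```python
from typing import Dict, Iterable, Set

# Signature databases named in the text.
DATABASES = ("PFAM", "TIGRFAM", "SMART", "Superfamily")


def classify_proteins(
    protein_ids: Iterable[str],
    hits_by_db: Dict[str, Set[str]],
) -> Dict[str, str]:
    """Label each protein "known" or "unknown".

    hits_by_db (hypothetical input) maps a database name to the set of
    protein IDs that matched at least one model in that database.
    """
    matched: Set[str] = set()
    for db in DATABASES:
        matched |= hits_by_db.get(db, set())
    # A match in any one database is enough to be "known";
    # "unknown" means no match in any database.
    return {pid: ("known" if pid in matched else "unknown") for pid in protein_ids}


def unknown_fraction(labels: Dict[str, str]) -> float:
    """Fraction of the proteome labelled "unknown"."""
    if not labels:
        return 0.0
    return sum(1 for label in labels.values() if label == "unknown") / len(labels)
```

Applied per species, `unknown_fraction` would give the kind of per-genome percentage summarized in the "% of genome" figure.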
What makes species different?
A study of Unique Genes
“Unknowns” account for about 25% of each genome
[Figure: % of genome]
“Unknowns” are not as conserved as “knowns”, even between related organisms!
[Figure: % hits at each BLAST e-value, known vs. unknown; panel: Yeast]
[Figure: relationship trees, known vs. unknown; labels: Outl., Sc, Sp, At, Os, Dm, Ag, Mm, Rn, Hs, Ce]
Relationship tree among the 10 different genomes reveals a high degree of evolutionary divergence among “unknowns” from different species.
“Unknowns” have a different rate of evolution? “Unknowns” are new genes?
[Figure: % unique genes, known vs. unknown]
“Unknowns” are mainly species-specific.
Representation of “unknowns” in the “unique-ome” of different species.
The “unique-ome” was defined by a BLAST cutoff of 10^-6 between the 10 different genomes.
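A minimal sketch of how such a “unique-ome” could be derived, under assumptions the source does not spell out: the cutoff is taken to be an e-value, the BLAST results are in the standard 12-column tabular format (`-outfmt 6`, where the e-value is the 11th column), there is one result file per comparison against another genome, and a hit at exactly the cutoff counts as a hit. The function names are hypothetical.

```python
import csv
import io
from typing import Iterable, Set, TextIO

EVALUE_CUTOFF = 1e-6  # cutoff from the text, assumed here to be an e-value


def genes_with_hits(blast_tsv: TextIO, cutoff: float = EVALUE_CUTOFF) -> Set[str]:
    """Return query IDs with at least one hit at or below the cutoff.

    Expects BLAST tabular output (-outfmt 6): column 1 is the query ID,
    column 11 is the e-value.
    """
    found: Set[str] = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= cutoff:
            found.add(query_id)
    return found


def unique_genes(all_genes: Iterable[str], blast_results: Iterable[TextIO]) -> Set[str]:
    """Genes of one species with no qualifying hit in any other genome."""
    conserved: Set[str] = set()
    for handle in blast_results:
        conserved |= genes_with_hits(handle)
    return set(all_genes) - conserved


# Toy example: geneA has a strong hit in another genome, geneB only a weak one,
# and geneC has no hit at all.
example = io.StringIO(
    "geneA\tsubj1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200\n"
    "geneB\tsubj2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20\n"
)
example_unique = unique_genes(["geneA", "geneB", "geneC"], [example])
```

In this toy case `example_unique` contains `geneB` and `geneC`; crossing that set with the known/unknown labels would give the share of “unknowns” within each species' unique genes.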
[Figure: three panels comparing known and unknown proteins across Sc, Sp, At, Os, Dm, Ag, Mm, Rn, Hs, and Ce: disorder/length, hydro index, and average sequence length (aa)]
Compared to “knowns”, “Unknowns” are more disordered, less hydrophobic and shorter.
“Unknown” Conclusions
• Unknown genes are typically species-specific and might provide some of the keys that define species-specific differences.
• Unraveling the function of “unknowns” would improve our understanding of species-specific functions.
• Disordered protein functions are thought to include the formation and regulation of large multi-molecular assemblies that participate in important regulatory functions. Disordered regions of proteins have been reported to evolve significantly more rapidly than ordered regions.
• “Unknowns” are likely to be the result of greater evolutionary divergence among species, leading to the establishment of new, species-specific regulatory networks.