Ramy K. Aziz San Diego State University & Cairo University Rocky 2009 Dec 10 2009 Nature’s most successful genes?

Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)


Oral presentation at the 7th Annual Rocky Mountain Bioinformatics Meeting (http://www.iscb.org/rocky09). Most results were published in http://nar.oxfordjournals.org/cgi/content/full/gkq140v1


Page 1: Nature's most abundant genes (ISCB Rocky Mountain Bioinformatics Meeting 2009)

Ramy K. Aziz
San Diego State University & Cairo University

Rocky 2009
Dec 10 2009

Nature’s most successful genes?


Spelling out the question: half the way to the answer

• What is prevalence? For an object x,

– Ubiquity (number of sets to which x belongs)

– Abundance (“average” frequency of x in a set)

@sets = (genomes, metagenomes, biomes)

• What to count (PEG / EGT / function / family)?

• How to count, and where (genomes / metagenomes)?

– Gene length matters: use frequency / gene length

– Metagenome size matters: use relative abundance
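The counting rules on this slide (ubiquity, abundance, and the two normalizations by gene length and metagenome size) can be sketched in Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function name `prevalence`, its argument names, and the toy numbers are all hypothetical, and taking the "average" over every set (including sets where the gene is absent) is one possible reading of the slide.

```python
from collections import defaultdict


def prevalence(counts, gene_length, set_size):
    """Return {gene: (ubiquity, abundance)}.

    counts:      {set_name: {gene: raw hit count in that set}}
    gene_length: {gene: gene length}
    set_size:    {set_name: total number of sequences in that set}

    Ubiquity is the number of sets containing the gene.
    Abundance is the hit count divided by gene length and by set size,
    averaged over ALL sets (an assumption; the slide only says "average").
    """
    ubiquity = defaultdict(int)
    normalized_sum = defaultdict(float)
    for set_name, hits in counts.items():
        for gene, n in hits.items():
            if n <= 0:
                continue  # an absent gene adds nothing to either measure
            ubiquity[gene] += 1
            # frequency / gene length, then relative to metagenome size
            normalized_sum[gene] += n / gene_length[gene] / set_size[set_name]
    n_sets = len(counts)
    return {g: (ubiquity[g], normalized_sum[g] / n_sets) for g in ubiquity}


# Toy data, made up purely for illustration.
toy_counts = {
    "metagenome_A": {"transposase": 10, "rubisco": 2},
    "metagenome_B": {"transposase": 30},
}
toy_lengths = {"transposase": 1000, "rubisco": 500}
toy_sizes = {"metagenome_A": 100, "metagenome_B": 300}

for gene, (ubi, abund) in prevalence(toy_counts, toy_lengths, toy_sizes).items():
    print(gene, ubi, abund)
```

Dividing by gene length keeps long genes from looking abundant only because they attract more hits, and dividing by set size makes large and small metagenomes comparable.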


Spelling out the question: half the way to the answer

• Current knowledge: RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is the enzyme with the highest copy number (mass?) in ecosystems. However, its gene is neither the most ubiquitous nor the most abundant.

• Any guesses? (An enzyme? A transcription factor? A transporter? DNA metabolism? Carbohydrate metabolism?)

– Guess 1:

– Guess 2:

– Guess 3:


And the winner is …


Metagenomes (187 sets; 6 million sequences)

Pearson Corr. 0.524

Plot labels: eco-essentiality, Life essentials, fertility, Habitat-specific


Gene ubiquity in genomes (2,137)

Pearson Corr. 0.645

Transposase

ABC transporter ATP-binding

Glycosyltransferase

ABC transporter permease

Two-component Sensor/Regulator

tRNA synthetases
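Both plots quote a Pearson correlation (0.524 for metagenomes, 0.645 for genomes). The slides do not state which two quantities are being correlated; per-function ubiquity against abundance is one plausible reading, but that is an assumption. For reference, the coefficient itself can be computed as in this minimal sketch, where the function name `pearson` is illustrative:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance times n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Root sums of squared deviations (standard deviations times sqrt(n)).
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)
```

A value near 1 would mean that the two measures rank functions almost identically; values such as those on the plots indicate a moderate positive relationship.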


(How/Why) Does it matter?

• Current annotations suck! Improvement needed.

• Transposases are no longer ‘junk hypothetical proteins’; their sheer numbers demand attention!

• The ‘selfish’ transposase genes must be offering their hosts some advantage.

• If rRNA is used to track genomes’ vertical history, transposases can track ‘horizontal’ history.

• Cheaters (always?) win…

• Transposases shall inherit the earth?


Acknowledgment

• This study would not have been possible without…

Rob Edwards & Mya Breitbart

And:

• Forest Rohwer, Liz Dinsdale, Anca Segall, Peter Salamon, & the Math group

• NSF funding (PhAnToMe grant)