Loss of information at deeper divergences · 2014-11-14

Loss of information at deeper divergences

David Penny, Phylomania, Nov 2013

The mathematicos caused the problem!!! Now they should solve it!

Okay, maybe we could help them. Here are some ideas.

Need relative, not absolute, information

the comfort zone

[Figure: grid of method labels (ML Int; ML Rel; Mlav ML; MLan MP ML; MLep MP MP) over three ranges: popn, classic phylogeny, deep phylogeny]

can we go further back in time?

Markov models - Loss of information

damned eukaryotes!

Calculated results, $\Delta \le \tfrac{1}{4} + n e^{-qt}$
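As a rough illustration of how a Markov model loses information with time, the sketch below computes the mutual information between an ancestral and a descendant nucleotide. It assumes the Jukes-Cantor model with uniform base frequencies, which is a choice made here for simplicity and not necessarily the model behind the calculated bound; the function names and the parameter `d` (expected substitutions per site) are illustrative only.

```python
import math


def jc_same_probability(d):
    """Jukes-Cantor probability that a site shows the same base after d substitutions/site."""
    return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)


def _term(p, q=0.25):
    """p * log2(p / q), with the usual convention that the term is 0 when p is 0."""
    return 0.0 if p <= 0.0 else p * math.log2(p / q)


def jc_mutual_information(d):
    """Mutual information (bits) between ancestor and descendant base.

    Assumes uniform base frequencies, so every row of the transition
    matrix contributes the same amount.
    """
    p_same = jc_same_probability(d)
    p_diff = (1.0 - p_same) / 3.0  # probability of each of the three other bases
    return _term(p_same) + 3.0 * _term(p_diff)


if __name__ == "__main__":
    for d in (0.0, 0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"d={d:<4} information={jc_mutual_information(d):.4f} bits")
```

Under these assumptions the information starts at 2 bits per site at d = 0 and decays towards zero as d grows, which is the qualitative behaviour the slide refers to.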

[Figure: calculated curves; y axis from -0.2 to 1; x axis ticks 1, 10, 100, 1000, 10000; series 0.01, 0.005, 0.002, 0.001]

[Figure: percentage of trees correct (y axis 0% to 120%); x axis ticks 0.1, 1, 10; series d=0.001, d=0.100, d=0.500, d=1.000, d=2.000, d=5.000, infinite]

1. simulation results with covarion model

number of internal edges correct, out of 6; neighbor joining, 9 taxa, 1000 columns, i.i.d.

[Figure: number of internal edges correct (0 to 6) against millions of years (log scale, ticks from 0 to 2000)]

simulation results with standard model

1. Idea: delete fast sites

If a) there were a mixture that included faster-evolving sites, and b) we could identify them, and c) remove them, would that help us go further back in time?
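A minimal sketch of steps b) and c), assuming that the number of distinct states in a column is an acceptable stand-in for site rate. That proxy, and the names `column_variability` and `drop_fastest_sites`, are assumptions for illustration; the talk does not say how fast sites would be identified.

```python
def column_variability(alignment):
    """Number of distinct states in each column (a crude proxy for site rate)."""
    n_cols = len(alignment[0])
    return [len({seq[c] for seq in alignment}) for c in range(n_cols)]


def drop_fastest_sites(alignment, fraction):
    """Remove the given fraction of columns, most variable first."""
    scores = column_variability(alignment)
    n_cols = len(scores)
    n_drop = int(n_cols * fraction)
    # Rank columns from most to least variable; ties are broken by position.
    ranked = sorted(range(n_cols), key=lambda c: (-scores[c], c))
    dropped = set(ranked[:n_drop])
    keep = [c for c in range(n_cols) if c not in dropped]
    return ["".join(seq[c] for c in keep) for seq in alignment]


if __name__ == "__main__":
    toy = ["ACGTA", "ACGAA", "ACCCA", "ACTGA"]  # toy data, not from the talk
    print(column_variability(toy))
    print(drop_fastest_sites(toy, 0.4))
```

Whether trees built from the reduced alignment recover deeper edges more often is exactly the open question on the slide; the sketch only shows the filtering step.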

deleting faster sites

Ancestral Sequence Reconstruction

Giardia animals plants

2, 3. Testing

Ancestral Sequence Reconstruction

vaults 3-D info

subgroups X and Y

[Figure: taxa a b c d e and k l m n o; ancestral nodes a_x and a_y; subgroup X, subgroup Y]

4. gene length vs similarity

     i  j  3  4  5  6  7  8  9  0  1  2
f1   a  .  .  .  .  a  .  .  .  .  .  .
f2   .  .  .  .  .  a  .  .  .  .  a  .
f3   .  .  .  .  .  a  .  .  .  .  .  a
f4   .  .  .  .  .  a  .  .  .  .  .  .
g1   .  a  a  a  a  .  a  a  .  .  .  .
g2   a  a  a  a  a  .  .  a  .  .  a  .
h1   .  .  a  a  a  .  .  .  a  a  .  .
h2   .  .  a  .  a  .  .  .  a  a  .  .
h3   a  .  .  a  a  .  .  .  a  a  .  .

     i  j
     .  .
     .  a
     a  .
     a  a

upper bound = 17

lower bound = 12

?

13

Would weighting by incompatibilities help?
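One way to make the question concrete is sketched below: count, for each column of the matrix above, how many other columns it is pairwise incompatible with (two binary columns are incompatible when all four state combinations occur), then down-weight columns with many conflicts. The weighting formula 1 / (1 + count) is purely an assumption for illustration, as are the function names; the talk does not specify a scheme.

```python
from itertools import combinations

# Rows f1..h3 of the character matrix, columns in the order i j 3 4 5 6 7 8 9 0 1 2.
ROWS = [
    "a....a......",  # f1
    ".....a....a.",  # f2
    ".....a.....a",  # f3
    ".....a......",  # f4
    ".aaaa.aa....",  # g1
    "aaaaa..a..a.",  # g2
    "..aaa...aa..",  # h1
    "..a.a...aa..",  # h2
    "a..aa...aa..",  # h3
]


def incompatible(rows, c1, c2):
    """True when two binary columns show all four state combinations."""
    return len({(row[c1], row[c2]) for row in rows}) == 4


def incompatibility_counts(rows):
    """For each column, the number of other columns it conflicts with."""
    n_cols = len(rows[0])
    counts = [0] * n_cols
    for c1, c2 in combinations(range(n_cols), 2):
        if incompatible(rows, c1, c2):
            counts[c1] += 1
            counts[c2] += 1
    return counts


def incompatibility_weights(rows):
    """Hypothetical weights: columns with more conflicts count for less."""
    return [1.0 / (1 + k) for k in incompatibility_counts(rows)]


if __name__ == "__main__":
    print(incompatibility_counts(ROWS))
    print([round(w, 2) for w in incompatibility_weights(ROWS)])
```

Columns i and j (indices 0 and 1) come out as incompatible, matching the four combinations listed under the matrix.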

5. Weighting

information from sequence order not used

Alignment                      Reordered Alignment
(original sequence order)      (shuffled/reordered)
AIIFLNSALGPSPELFPIILATKVL      ASAGPSPPATPLLIIIILLFFNEKV
AIMFLNSALGPPTELFPVILATKVL      ASAGPPTPATPLLIMVILLFFNEKV
SIMFLNHTLNPTPELFPIILATETL      SHTNPTPPATPLLIMIILLFFNEET
TILFLNSSLGLQPEVTPTVLATKTL      TSSGLQPPATPLLILTVLVTFNEKT
TLLFLNSMLKPPSELFPIILATKTL      TSMKPPSPATPLLLLIILLFFNEKT
ALLFLNSTLNPPTELFPLILATKTL      ASTNPPTPATPLLLLLILLFFNEKT
AILFLNSFLNPPKEFFPIILATKIL      ASFNPPKPATPLLILIILFFFNEKI

c! ways to reorder alignment

shuffle by columns & by taxa
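The point about unused sequence order can be checked with a small sketch: any measure that treats columns independently is unchanged by any of the c! column permutations. The example below uses the first four sequences of the original alignment and pairwise Hamming distances; the choice of distance, the fixed seed, and the function names are assumptions for illustration.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(alignment):
    """Hamming distance for every pair of sequences, in a fixed order."""
    return [hamming(a, b) for a, b in combinations(alignment, 2)]


def shuffle_columns(alignment, seed=0):
    """Apply one random column permutation to every sequence."""
    order = list(range(len(alignment[0])))
    random.Random(seed).shuffle(order)
    return ["".join(seq[c] for c in order) for seq in alignment]


ALIGNMENT = [
    "AIIFLNSALGPSPELFPIILATKVL",
    "AIMFLNSALGPPTELFPVILATKVL",
    "SIMFLNHTLNPTPELFPIILATETL",
    "TILFLNSSLGLQPEVTPTVLATKTL",
]

if __name__ == "__main__":
    shuffled = shuffle_columns(ALIGNMENT)
    print(pairwise_distances(ALIGNMENT))
    print(pairwise_distances(shuffled))  # same distances, different column order
```

The same invariance holds for parsimony and for likelihood under i.i.d. site models, since both sum or multiply over columns without regard to their position.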

6. So, could we use ‘words’ of 2, 3, 4, 5, … letters?
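A minimal sketch of what using 'words' could mean, assuming overlapping words of a fixed length k read along each sequence. The helper names are illustrative, and how such words would enter a tree-building method is left open, as it is on the slide.

```python
from collections import Counter


def words(sequence, k):
    """All overlapping words of length k, in sequence order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def word_counts(sequence, k):
    """How often each word of length k occurs in the sequence."""
    return Counter(words(sequence, k))


if __name__ == "__main__":
    original, reordered = "ABAB", "AABB"  # toy strings with the same letters
    print(word_counts(original, 1) == word_counts(reordered, 1))
    print(word_counts(original, 2) == word_counts(reordered, 2))
```

Single letters cannot tell the two toy strings apart, but words of two letters can, which is the sense in which words recover information from sequence order.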

7. Alphabet reduction
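As a sketch of alphabet reduction, the example below recodes amino acids into the six Dayhoff classes (AGPST, DENQ, HKR, ILMV, FWY, C). The choice of Dayhoff recoding is an assumption made here; the talk does not say which reduced alphabet is intended, and the one-letter class labels are arbitrary.

```python
# Dayhoff six-class grouping; each class is labelled by one of its members.
DAYHOFF_CLASSES = {
    "AGPST": "A",
    "DENQ": "D",
    "HKR": "H",
    "ILMV": "I",
    "FWY": "F",
    "C": "C",
}

# Flatten the grouping into a per-residue lookup table.
RECODE = {
    residue: label
    for members, label in DAYHOFF_CLASSES.items()
    for residue in members
}


def reduce_alphabet(sequence):
    """Recode a protein sequence; unknown symbols (gaps, X) are kept unchanged."""
    return "".join(RECODE.get(residue, residue) for residue in sequence)


if __name__ == "__main__":
    print(reduce_alphabet("AIIFLNSALGPSPELFPIILATKVL"))
```

Substitutions within a class become invisible after recoding, so the reduced alignment changes more slowly at the cost of having fewer states per site.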
