Recombination Histories & Global Pedigrees

Acknowledgements Yun Song - Rune Lyngsø - Mike Steel

Finding Minimal Recombination Histories

Global Pedigrees

Finding Common Ancestors

[Figure: three panels, each with leaves labelled 1 2 3 4; the present is marked NOW.]

Basic Evolutionary Events

• Recombination

• Gene Conversion

• Coalescent/Duplication

• Mutation

Infinite site assumption?

Hudson & Kaplan’s RM

If RM is equated with the expected number of recombinations, it could serve as an estimator. Unfortunately, RM grossly underestimates the true number of recombinations.

0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 10 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 11 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 11 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1
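A minimal sketch of how RM can be computed from a 0/1 matrix, assuming the usual formulation: every pair of sites showing all four combinations 00, 01, 10, 11 marks an interval that must contain a recombination, and RM is the largest number of such intervals that do not overlap. The function names and the toy data are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations


def incompatible(site_a, site_b):
    """Four-gamete test: True if 00, 01, 10 and 11 all occur at the two sites."""
    return len(set(zip(site_a, site_b))) == 4


def hudson_kaplan_rm(rows):
    """Lower bound on recombinations from equal-length 0/1 strings, one per sequence."""
    sites = list(zip(*rows))  # transpose: one tuple per column
    # Each incompatible pair (i, j) needs at least one recombination between i and j.
    intervals = [(i, j) for i, j in combinations(range(len(sites)), 2)
                 if incompatible(sites[i], sites[j])]
    # Greedy interval scheduling: take intervals by increasing right end,
    # skipping any interval that overlaps the last one taken.
    intervals.sort(key=lambda pair: pair[1])
    count, last_end = 0, 0
    for left, right in intervals:
        if left >= last_end:
            count += 1
            last_end = right
    return count


toy = ["000", "011", "101", "110"]  # hypothetical data, not from the slides
print(hudson_kaplan_rm(toy))
```

For the toy data all three site pairs are incompatible, but only the intervals (0, 1) and (1, 2) are disjoint, so the bound is 2.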

Local Inference of Recombinations

Two sites across seven sequences:

0000111

0001101

Four combinations: 00, 10, 01, 11

Incompatibility: all four combinations occur at the pair of sites.
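The check on the two example sites can be sketched as follows: pair up the states sequence by sequence and see whether all four combinations appear. The variable names are illustrative.

```python
site_a = "0000111"  # first site, one character per sequence
site_b = "0001101"  # second site, same sequence order

# Collect the distinct (site_a, site_b) state pairs seen across the sequences.
observed = {a + b for a, b in zip(site_a, site_b)}
is_incompatible = len(observed) == 4

print(sorted(observed))
print(is_incompatible)
```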

Myers-Griffiths (2003): the number of recombinations in a sample, $N_R$, the number of types, $N_T$, and the number of mutations, $N_M$, obey:

$N_R \geq N_T - N_M - 1$
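A small sketch of this bound, read as $N_R \geq N_T - N_M - 1$ and assuming recoded 0/1 data in which every segregating column corresponds to exactly one mutation (infinite sites). The function name and the toy data are hypothetical.

```python
def haplotype_bound(rows):
    """N_T - N_M - 1 for equal-length 0/1 strings, one per sequence."""
    n_types = len(set(rows))  # distinct sequences in the sample
    # Under the infinite sites assumption each segregating column is one mutation.
    n_mutations = sum(1 for column in zip(*rows) if len(set(column)) > 1)
    return max(0, n_types - n_mutations - 1)


toy = ["00", "01", "10", "11"]  # hypothetical data: 4 types, 2 mutations
print(haplotype_bound(toy))
```

With four types and two mutations the bound is 4 - 2 - 1 = 1. Without recombination, two mutations can produce at most three types.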

0011

0101

T . . . G
T . . . C
A . . . G
A . . . C

Recoding

• At most 1 mutation per column

• 0 ancestral state, 1 derived state
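The recoding rules can be sketched as below. The ancestral sequence passed in ("TG") is an assumption made for illustration; the slides do not say which base is ancestral at each site, and the function name is hypothetical.

```python
def recode(sequences, ancestral):
    """Recode aligned sequences to 0/1: 0 = ancestral state, 1 = derived state."""
    for i, column in enumerate(zip(*sequences)):
        # More than two states in a column would need more than one mutation.
        if len(set(column) | {ancestral[i]}) > 2:
            raise ValueError(f"column {i} needs more than one mutation")
    return ["".join("0" if base == anc else "1" for base, anc in zip(seq, ancestral))
            for seq in sequences]


# The two variable sites of the four example sequences, with assumed ancestral bases T and G.
recoded = recode(["TG", "TC", "AG", "AC"], ancestral="TG")
columns = ["".join(col) for col in zip(*recoded)]
print(recoded)
print(columns)
```

Under that assumption the two columns come out as 0011 and 0101, the pattern shown above.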

Minimal Number of Recombinations

Last Local Tree Algorithm:

[Figure: Data with sequences 1, 2, ..., n and sites 1, 2, ..., i-1, i, ..., L; Trees at sites i-1 and i.]

The Kreitman data (1983): 11 sequences, 3200bp, 43(28) recoded, 9 different

How many local trees?

• Unrooted: $\frac{(2n-2)!}{2^{n-1}(n-1)!}$

• Coalescent: $\frac{n!\,(n-1)!}{2^{n-1}}$

How many neighbors?

• $3n^2 - 13n + 14$

• $\sim n^3$

Bi-partitions

Metrics on Trees based on subtree transfers.

Pretending that the easy problem (unrooted) is the real problem (age ordered) causes violations of the triangle inequality:

Tree topologies with age ordered internal nodes

Rooted tree topologies

Unrooted tree topologies

Trees including branch lengths

Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees
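A quick numerical illustration of this observation, taking two of the expressions in this section at face value: the topology count $(2n-2)!/(2^{n-1}(n-1)!)$ and the neighbourhood size $3n^2 - 13n + 14$. The count $n!(n-1)!/2^{n-1}$ for age-ordered trees is included for comparison. Pairing these formulas with each other and the function names are assumptions made for this sketch.

```python
from math import factorial


def rooted_topologies(n):
    """(2n-2)! / (2^(n-1) (n-1)!), which equals (2n-3)!!."""
    return factorial(2 * n - 2) // (2 ** (n - 1) * factorial(n - 1))


def age_ordered_topologies(n):
    """n! (n-1)! / 2^(n-1)."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


def unit_neighbourhood(n):
    """3n^2 - 13n + 14, as quoted in this section."""
    return 3 * n * n - 13 * n + 14


for n in (4, 6, 8, 11):
    print(n, rooted_topologies(n), age_ordered_topologies(n), unit_neighbourhood(n))
```

For n = 4 this gives 15 topologies, 18 age-ordered trees and a neighbourhood of 10. For n = 11, the size of the Kreitman sample, the topology count is 654729075 while the neighbourhood has only 234 trees.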

Song (2003+)

Due to Yun Song

Tree Combinatorics and Neighborhoods

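$(2n-3)!! = \frac{(2n-2)!}{2^{n-1}(n-1)!}$

Allen & Steel (2001)

$2(n-3)(2n-7)$

$3n^2 - 13n + 14$

$4(n-2)^2 - 2\sum_{m=1}^{n-2} \lfloor \log_2(m+1) \rfloor$

$\frac{n!\,(n-1)!}{2^{n-1}}$

$\frac{1}{3}\left(2n^3 - 3n^2 - 20n + 39\right)$

[Figure: tree diagram with labels 1 to 7.]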

Methods: # of rec events obtained

Hudson & Kaplan (1985): 5

Myers & Griffiths (2003): 6

Song & Hein (2004). Set theory based approach: 7

Song & Hein (2003). Tree scanning using DP: 7

Lyngsø, Song & Hein (2006). Massive Acceleration using Branch and Bound Algorithm: 7

Lyngsø, Song & Hein (2006). Minimal number of Gene Conversions (in prep.): 5-2/6-1

The Minimal Recombination History for the Kreitman Data

- recombination 27 ACs

[Figure: minimal recombination history for the Kreitman data; axis from 0 to 8.]

The Griffiths-Ethier-Tavaré Recursions

No recombination: Infinite Site Assumption

Ancestral State Known

History Graph: Recursions Exist

No cycles

Possible Histories without Recombination for simple data example

+ recombination 3*10^8 ACs

1st

2nd

Ancestral configurations to 2 sequences with 2 segregating sites

mid-point heuristic

Counting + Branch and Bound Algorithm

[Figure labels: Exact length (?), Lower bound, Upper bound]

k-recombination neighborhood

k: count
0: 3
1: 91
2: 1314
3: 8618
4: 30436
5: 62794
6: 78970
7: 63049
8: 32451
9: 10467
10: 1727
Total: 289920

minARGs: Recombination Events & Local Trees

True ARG

Reconstructed ARG

1 2 3 4 5

1 2 3 4 5

((1,2),(1,2,3))

((1,3),(1,2,3))

n=7, =10, =75

Minimal ARG

True ARG

0 4 Mb

Hudson-Kaplan

Myers-Griffiths

Song-Hein

n=8, =40

n=8, =15

Mutation information on only one side

Mutation information on both sides

Reconstructing global pedigrees: Superpedigrees

Steel and Hein, 2005

The gender-labeled pedigrees for all pairs define the global pedigree.

Gender-unlabeled pedigrees don't!!

• All embedded phylogenies are observable

• Do they determine the pedigree?

Genomes with recombination rate and mutation rate --> infinity

Benevolent Mutation and Recombination Process

Counter example: Embedded phylogenies: