Lecture3_Jul20

203.343 Advanced Genetics and Genomics

Lecture 3

July 20th 2015

Olin Silander

Gene Linkage and Linkage Disequilibrium

Describe the coalescent and the neutral model of evolution

Hardy Weinberg equilibrium

Two alleles: A 60%, a 40%

Genotype frequencies of AA, Aa, aa?

aa: a 40% x a 40% = 16%

Aa: 2 x (A 60% x a 40%) = 48%

AA: A 60% x A 60% = 36%
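The arithmetic on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function name `hwe_genotype_frequencies` is a label chosen for this example, and it assumes a single locus with two alleles and random mating.

```python
def hwe_genotype_frequencies(p):
    """Return expected genotype frequencies for allele frequency p of allele A.

    Assumes two alleles (A and a) and random mating, so q = 1 - p.
    """
    q = 1.0 - p
    return {
        "AA": p * p,      # both gametes carry A
        "Aa": 2 * p * q,  # A from one parent and a from the other, either order
        "aa": q * q,      # both gametes carry a
    }


# Allele frequencies from the slide: A 60%, a 40%
freqs = hwe_genotype_frequencies(0.6)
for genotype, value in freqs.items():
    print(f"{genotype}: {value:.0%}")
```

The factor of 2 in the heterozygote term is the step that is easiest to miss: Aa can arise as A from the mother and a from the father, or the reverse.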

Hardy Weinberg equilibrium

What are the assumptions of HWE?

(1) diploid (2) non-overlapping generations (3) sexual reproduction (4) random mating (5) infinite population size (6) no selection

Genotype frequencies (aa, Aa, AA) when an assumption holds or is violated:

infinite population size: 16% 48% 36%

finite population size (drift): 30% 40% 30%

selection: 13% 10% 77%
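One way to see why a finite population departs from the 16% / 48% / 36% expectation is to sample a limited number of individuals. The sketch below is an illustration only: the function name, the population sizes, and the fixed seed are all assumptions made for this example, and it models one generation of random mating rather than drift over many generations.

```python
import random


def sample_genotype_frequencies(p, n_individuals, seed=0):
    """Draw n_individuals diploid genotypes with allele A at frequency p.

    Each individual receives two independent gametes (random mating).
    The seed is fixed only so that repeated runs give the same numbers.
    """
    rng = random.Random(seed)
    counts = {"AA": 0, "Aa": 0, "aa": 0}
    for _ in range(n_individuals):
        # Number of A alleles among the two gametes: 0, 1, or 2
        n_A = sum(rng.random() < p for _ in range(2))
        genotype = ("aa", "Aa", "AA")[n_A]
        counts[genotype] += 1
    return {g: c / n_individuals for g, c in counts.items()}


small = sample_genotype_frequencies(0.6, 10)      # strong sampling noise
large = sample_genotype_frequencies(0.6, 100000)  # close to HWE expectation
print("N = 10:    ", small)
print("N = 100000:", large)
```

With a very large sample the frequencies approach the infinite-population values; with a small sample they can differ substantially by chance alone.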

Hardy Weinberg equilibrium: two alleles

Linkage: two alleles, two loci

Four possible haplotypes: A B, A b, a b, a B

Allele frequencies: A 60%, a 40%, B 80%, b 20%

Linkage: two alleles, two loci

Allele frequencies: A 60%, a 40%, B 80%, b 20%

Haplotype frequencies:

Haplotype   Linkage equilibrium   Linkage disequilibrium
A B         48%                   60%
A b         12%                   0%
a b         8%                    20%
a B         32%                   20%
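Under linkage equilibrium each haplotype frequency is simply the product of its two allele frequencies. The following sketch reproduces the 48% / 12% / 32% / 8% values; the function name is a label chosen for this example.

```python
def equilibrium_haplotype_frequencies(f_A, f_B):
    """Expected haplotype frequencies when the two loci are independent."""
    f_a = 1.0 - f_A
    f_b = 1.0 - f_B
    return {
        "AB": f_A * f_B,
        "Ab": f_A * f_b,
        "aB": f_a * f_B,
        "ab": f_a * f_b,
    }


# Allele frequencies from the slide: A 60%, B 80%
expected = equilibrium_haplotype_frequencies(0.6, 0.8)
for haplotype, value in expected.items():
    print(f"{haplotype}: {value:.0%}")
```

Any set of observed haplotype frequencies that differs from these products, such as 60% / 0% / 20% / 20%, is in linkage disequilibrium.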

Linkage: two alleles, two loci

Allele frequencies: A 60%, a 40%, B 80%, b 20%

Linkage equilibrium: $f_{AB} = 0.48$, $f_{Ab} = 0.12$, $f_{aB} = 0.32$, $f_{ab} = 0.08$

$f_{AB} - f_A f_B = 0$

Linkage disequilibrium: $f_{AB} = 0.60$, $f_{Ab} = 0$, $f_{aB} = 0.20$, $f_{ab} = 0.20$

$f_{AB} - f_A f_B = ?$

Quantify LD: $f_{AB} - f_A f_B = D$

Quantifying LD

Allele frequencies: A 60%, a 40%, B 80%, b 20%

Linkage equilibrium: $D = f_{AB} - f_A f_B = 0$

Linkage disequilibrium ($f_{AB} = 0.60$, $f_{Ab} = 0$, $f_{aB} = 0.20$, $f_{ab} = 0.20$): $D = f_{AB} - f_A f_B = 0.60 - 0.6 \times 0.8 = 0.12$

What is the maximum value of D?
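The two values of D on this slide follow directly from the definition. A minimal sketch, with a function name chosen only for this example:

```python
def linkage_disequilibrium(f_AB, f_A, f_B):
    """D = observed AB haplotype frequency minus its expectation under independence."""
    return f_AB - f_A * f_B


# Allele frequencies from the slide: A 60%, B 80%
d_equilibrium = linkage_disequilibrium(0.48, 0.6, 0.8)
d_disequilibrium = linkage_disequilibrium(0.60, 0.6, 0.8)

print(f"D (linkage equilibrium):    {d_equilibrium:.2f}")
print(f"D (linkage disequilibrium): {d_disequilibrium:.2f}")
```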

Quantifying LD

Allele frequencies: A 60%, a 40%, B 80%, b 20%

Quantify LD: $D = f_{AB} - f_A f_B$

D': D scaled to be between 0 and 1

if $D > 0$: $D' = \frac{D}{\min(f_A f_b,\ f_a f_B)}$

if $D < 0$: $D' = \frac{D}{\max(-f_A f_B,\ -f_a f_b)}$

Linkage disequilibrium with $f_{AB} = 0.60$, $f_{Ab} = 0$, $f_{aB} = 0.20$, $f_{ab} = 0.20$: $D = 0.12$

Linkage disequilibrium with $f_{AB} = 0.55$, $f_{Ab} = 0.05$, $f_{aB} = 0.25$, $f_{ab} = 0.15$: $D = 0.55 - 0.6 \times 0.8 = 0.07$
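The scaling of D to D' can be sketched as follows. The function name `d_prime` and the choice to return 0 when D is exactly 0 are assumptions made for this example; the two branches follow the definition given for D > 0 and D < 0.

```python
def d_prime(f_AB, f_A, f_B):
    """Scale D by the most extreme value it could take given the allele frequencies."""
    f_a = 1.0 - f_A
    f_b = 1.0 - f_B
    d = f_AB - f_A * f_B
    if d > 0:
        d_max = min(f_A * f_b, f_a * f_B)
    elif d < 0:
        # Both candidates are negative, so the ratio below is positive
        d_max = max(-f_A * f_B, -f_a * f_b)
    else:
        return 0.0  # assumption: report 0 when there is no disequilibrium
    return d / d_max


# Allele frequencies from the slide: A 60%, B 80%
d_prime_first = d_prime(0.60, 0.6, 0.8)   # D = 0.12, largest D possible here
d_prime_second = d_prime(0.55, 0.6, 0.8)  # D = 0.07

print(f"D' for f_AB = 0.60: {d_prime_first:.2f}")
print(f"D' for f_AB = 0.55: {d_prime_second:.2f}")
```

With these allele frequencies the largest positive D is min(0.6 x 0.2, 0.4 x 0.8) = 0.12, which is why the haplotype set with no Ab haplotypes reaches D' = 1.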

Quantifying LD

Two populations of 20 individuals, each typed at two loci:

        population 2    population 1
Ind.    locA  locB      locA  locB
1       A     A         A     C
2       A     A         A     A
3       A     A         A     C
4       A     A         A     C
5       A     A         A     C
6       A     A         A     A
7       A     C         A     A
8       A     A         A     A
9       A     A         A     C
10      A     A         A     A
11      T     C         T     C
12      T     C         T     C
13      T     C         T     C
14      T     A         T     A
15      T     C         T     A
16      T     C         T     A
17      T     C         T     A
18      T     C         T     A
19      T     C         T     C
20      T     C         T     C

Allele labels: locus A: A = A, a = T; locus B: B = C, b = A

$D = f_{AB} - f_A f_B$

population 1: $D = 0.25 - 0.5 \times 0.5 = 0$

population 2: $D = 0.05 - 0.5 \times 0.5 = -0.2$

For population 2, the other allele choices give:

$f_{Ab} - f_A f_b = 0.45 - 0.5 \times 0.5 = 0.2$

$f_{aB} - f_a f_B = 0.45 - 0.5 \times 0.5 = 0.2$

$f_{ab} - f_a f_b = 0.05 - 0.5 \times 0.5 = -0.2$

The choice of alleles does not matter: the absolute value of D is always the same.

population 2, $D < 0$: $D' = \frac{D}{\max(-f_A f_B,\ -f_a f_b)} = \frac{-0.2}{-0.25} = 0.8$
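The D values for the two populations can be recomputed from the per-individual data. The sketch below is illustrative: the function name and the string encoding of the tables are choices made for this example, and the assignment of the strongly associated table to population 2 follows the "90% chance" statement made about population 2 in these slides.

```python
def d_from_haplotypes(haplotypes, allele_A, allele_B):
    """Compute D for a chosen allele at each locus from (locA, locB) pairs."""
    n = len(haplotypes)
    f_A = sum(1 for a, _ in haplotypes if a == allele_A) / n
    f_B = sum(1 for _, b in haplotypes if b == allele_B) / n
    f_AB = sum(1 for a, b in haplotypes if a == allele_A and b == allele_B) / n
    return f_AB - f_A * f_B


# locA is A for individuals 1-10 and T for individuals 11-20 in both populations
loc_a = "A" * 10 + "T" * 10
pop1 = list(zip(loc_a, "CACCCAAACA" + "CCCAAAAACC"))
pop2 = list(zip(loc_a, "AAAAAACAAA" + "CCCACCCCCC"))

# Slide labels: A = A and a = T at locus A; B = C and b = A at locus B
print("population 1, D:", round(d_from_haplotypes(pop1, "A", "C"), 3))
print("population 2, D:", round(d_from_haplotypes(pop2, "A", "C"), 3))

# Every choice of alleles gives the same absolute value of D
for allele_a in "AT":
    for allele_b in "CA":
        d = d_from_haplotypes(pop2, allele_a, allele_b)
        print(f"population 2, alleles ({allele_a}, {allele_b}): D = {d:+.2f}")
```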

Quantifying LD

population 2: if we see an A at the first locus (locA), there is a 90% chance there is an A at the second locus (locB)

$r = \frac{D}{\sqrt{f_A f_B f_a f_b}}$

Quantifying LD

Allele frequencies: A 60%, a 40%, B 80%, b 20%

Linkage disequilibrium: $f_{AB} = 0.55$, $f_{Ab} = 0.05$, $f_{aB} = 0.25$, $f_{ab} = 0.15$

$D = f_{AB} - f_A f_B = 0.07$

Since $D > 0$: $D' = \frac{D}{\min(f_A f_b,\ f_a f_B)} = \frac{0.07}{0.12} \approx 0.58$

$r = \frac{D}{\sqrt{f_A f_B f_a f_b}} = \frac{0.07}{0.196} \approx 0.36$
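The correlation r can be computed in the same way as D and D'. A minimal sketch, with a function name chosen for this example; the second call uses the population 2 frequencies from the two-population slides (f_AB = 0.05, f_A = f_B = 0.5).

```python
import math


def ld_correlation(f_AB, f_A, f_B):
    """r = D divided by the square root of the product of all four allele frequencies."""
    f_a = 1.0 - f_A
    f_b = 1.0 - f_B
    d = f_AB - f_A * f_B
    return d / math.sqrt(f_A * f_B * f_a * f_b)


r_example = ld_correlation(0.55, 0.6, 0.8)  # D = 0.07
r_pop2 = ld_correlation(0.05, 0.5, 0.5)     # D = -0.2

print(f"r for f_AB = 0.55: {r_example:.3f}")
print(f"r for population 2: {r_pop2:.2f}")
```

Unlike D', r keeps the sign of D, so a negative value means the A and B alleles occur together less often than expected.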

Quantifying LD

[Figure: genealogies of individuals 1 to 20 at two loci. Locus 1 carries alleles A and T; locus 2 carries alleles C and T.]

Both loci have the same demographic history: complete linkage disequilibrium

Both loci have different demographic histories: partial linkage disequilibrium

Quantifying LD

D: non-random associations between alleles at loci A and B

D': D normalized by allele frequencies

r: correlation (D normalized by allele frequencies)

Why do different loci have different demographic histories?

Recombination

Why do different loci have the same demographic histories?

selection

population size (bottleneck)

(lack of) recombination

population admixture (migration)