Polymorphism Structure of the Human Genome Gabor T. Marth Department of Biology Boston College Chestnut Hill, MA 02467


Page 1

Polymorphism Structure of the Human Genome

Gabor T. Marth

Department of Biology, Boston College, Chestnut Hill, MA 02467

Page 2

Human variation structure is heterogeneous

chromosomal averages

polymorphism density along chromosomes

Page 3

Heterogeneity at the level of distributions

[Figure, two panels]

marker density: distribution ranging from “sparse” to “dense” (series labeled 4 kb, 8 kb, 12 kb, 16 kb)

allele frequency: distribution ranging from “rare” to “common” (x-axis 1 to 10)

Page 4

What explains nucleotide diversity?

[Figure, three panels; y-axis in each: SNP rate (per 10,000 bp)]

G+C nucleotide content (x-axis: G+C content [%])

CpG di-nucleotide content (x-axis: CpG content [%])

recombination rate (x-axis: recombination rate [per Mb])

functional constraints

3’ UTR: 5.00 × 10⁻⁴

5’ UTR: 4.95 × 10⁻⁴

Exon, overall: 4.20 × 10⁻⁴

Exon, coding: 3.77 × 10⁻⁴

synonymous: 366 / 653

non-synonymous: 287 / 653

Variance is so high that these quantities are poor predictors of nucleotide diversity in local regions. Hence random processes are likely to govern the basic shape of the genome variation landscape: (random) genetic drift.

Page 5

Components of drift: Genealogy

present generation

in a randomly mating population, the genealogy evolves in a non-deterministic fashion

Page 6

Components of drift: Mutation

mutations randomly “drift”: they die out, go to higher frequency, or get fixed
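The three fates named above can be illustrated with a small forward simulation. This is a minimal sketch assuming a neutral Wright-Fisher model with 2N gene copies and a single new mutation; the function names, population size, and replicate count are hypothetical choices for illustration, not taken from the slides.

```python
import random


def drift_fate(pop_size, rng, max_generations=100_000):
    """Follow one new neutral mutation among 2N gene copies (Wright-Fisher).

    Returns ("lost" | "fixed" | "segregating", generations elapsed).
    """
    total = 2 * pop_size
    copies = 1  # the mutation starts as a single copy
    for generation in range(1, max_generations + 1):
        freq = copies / total
        # Each gene copy in the next generation picks its parent at random.
        copies = sum(1 for _ in range(total) if rng.random() < freq)
        if copies == 0:
            return "lost", generation
        if copies == total:
            return "fixed", generation
    return "segregating", max_generations


def fate_counts(pop_size, replicates, seed=0):
    """Tally the fates of many independent new mutations."""
    rng = random.Random(seed)
    counts = {"lost": 0, "fixed": 0, "segregating": 0}
    for _ in range(replicates):
        fate, _ = drift_fate(pop_size, rng)
        counts[fate] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical parameters: N = 50 diploid individuals, 500 new mutations.
    print(fate_counts(pop_size=50, replicates=500, seed=1))
```

Under these assumptions most new mutations die out quickly, and a neutral mutation is expected to fix with probability 1/(2N).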

Page 7

Modulators: Changing population size

mutations randomly “drift”: they die out, go to higher frequency, or get fixed

genetic bottleneck
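One way to see the effect of a bottleneck is the standard recursion for expected heterozygosity under drift without new mutation, H(t+1) = H(t) · (1 − 1/(2N(t))). The sketch below applies it to two hypothetical size histories; the population sizes and durations are illustrative assumptions only.

```python
def heterozygosity_trajectory(h0, sizes):
    """Expected heterozygosity over time, ignoring new mutations.

    sizes[t] is the diploid population size N in generation t; each
    generation multiplies heterozygosity by (1 - 1 / (2N)).
    """
    h = h0
    trajectory = [h]
    for n in sizes:
        h *= 1.0 - 1.0 / (2.0 * n)
        trajectory.append(h)
    return trajectory


# Hypothetical histories, 100 generations each.
constant = [10_000] * 100
bottleneck = [10_000] * 45 + [50] * 10 + [10_000] * 45

if __name__ == "__main__":
    print("constant size:", heterozygosity_trajectory(1.0, constant)[-1])
    print("bottleneck:   ", heterozygosity_trajectory(1.0, bottleneck)[-1])
```

In this sketch the ten small-population generations remove far more diversity than the ninety large-population generations combined.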

Page 8

Modulators: Population subdivision

subdivision

subdivision promotes private polymorphisms, and skews allele frequency

Page 9

Modulators: Recombination

accgttatgcaga acagttatgtaga

acagttatgcaga

accgttatgtaga

accgttatgcaga acagttatgtaga

recombination

different nucleotide sites within the same DNA segment no longer share the same genealogy
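The sequences on this slide can be reproduced with a single crossover. The sketch below is illustrative only: the helper name `crossover` and the breakpoint position are assumptions, chosen so that the breakpoint falls between the two variable sites.

```python
def crossover(hap_a, hap_b, breakpoint):
    """Exchange the segments to the right of `breakpoint` between two haplotypes."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same length")
    return (hap_a[:breakpoint] + hap_b[breakpoint:],
            hap_b[:breakpoint] + hap_a[breakpoint:])


parent_1 = "accgttatgcaga"
parent_2 = "acagttatgtaga"

# 0-based positions where the two parental haplotypes differ.
variable_sites = [i for i, (x, y) in enumerate(zip(parent_1, parent_2)) if x != y]

# A breakpoint anywhere between the two variable sites yields the recombinants.
recombinant_1, recombinant_2 = crossover(parent_1, parent_2, breakpoint=5)

if __name__ == "__main__":
    print(variable_sites)   # [2, 9]
    print(recombinant_1)    # accgttatgtaga
    print(recombinant_2)    # acagttatgcaga
```

After the crossover, each recombinant carries its left variable site from one parent and its right variable site from the other, so the two sites trace back through different ancestors.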

Page 10

Modulators: Natural selection

negative (purifying) selection

positive selection

the genealogy is no longer independent of (and hence cannot be decoupled from) the mutation process

Page 11

Modeling ancestral processes

“forward simulations” the “Coalescent” process

By focusing on a small sample, complexity of the relevant part of the ancestral process is greatly reduced. There are, however, limitations.
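To illustrate why the sample-based view is cheap, the sketch below draws coalescence waiting times for a sample of n lineages under the standard neutral coalescent with constant population size, an assumption made here for simplicity. Time is measured in units of 2N generations, and all function names are hypothetical.

```python
import random


def coalescent_times(sample_size, rng):
    """Waiting times T_k while k lineages remain, in units of 2N generations.

    With k lineages, the time to the next coalescence is exponential with
    rate k * (k - 1) / 2.
    """
    times = {}
    for k in range(sample_size, 1, -1):
        rate = k * (k - 1) / 2.0
        times[k] = rng.expovariate(rate)
    return times


def expected_tmrca(sample_size):
    """Expected time to the most recent common ancestor: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / sample_size)


def expected_total_length(sample_size):
    """Expected total branch length: 2 * sum_{i=1}^{n-1} 1/i."""
    return 2.0 * sum(1.0 / i for i in range(1, sample_size))


if __name__ == "__main__":
    rng = random.Random(0)
    times = coalescent_times(10, rng)
    print("simulated TMRCA:", sum(times.values()))
    print("expected TMRCA: ", expected_tmrca(10))
```

Only n − 1 random draws are needed per genealogy, regardless of the size of the population the sample came from.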

Page 12

Inferences from variation data

larger population size (N) -> more mutations -> higher diversity (θ)

larger mutation rate (μ) -> more mutations -> higher diversity (θ)

higher diversity -> larger population size OR higher mutation rate (θ = 4Nμ)
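The ambiguity in the last line can be made concrete with θ = 4Nμ. In the sketch below, the values of N and μ are hypothetical and chosen only to show that different combinations give the same θ.

```python
import math


def theta(pop_size, mutation_rate):
    """Diversity parameter theta = 4 * N * mu."""
    return 4.0 * pop_size * mutation_rate


def implied_pop_size(theta_value, mutation_rate):
    """Population size implied by theta once a mutation rate is assumed."""
    return theta_value / (4.0 * mutation_rate)


# Hypothetical parameter pairs: half the population size, twice the mutation rate.
theta_a = theta(pop_size=20_000, mutation_rate=1e-8)
theta_b = theta(pop_size=10_000, mutation_rate=2e-8)

if __name__ == "__main__":
    print(theta_a, theta_b, math.isclose(theta_a, theta_b))
    print(implied_pop_size(theta_a, mutation_rate=1e-8))
```

Diversity alone therefore constrains only the product Nμ; separating the two requires an independent estimate of one of them.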

Page 13

Ancestral inference: modeling

[Figure: population size histories from past to present (stationary, expansion, collapse, bottleneck), each shown with its MD (simulation) and AFS (direct form) distributions]
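For the stationary history, the expected allele frequency spectrum has a well-known closed form under the standard neutral model: the number of sites with derived allele count i is proportional to 1/i. The sketch below folds that spectrum by minor allele count. It covers only the stationary case; the function name and the sample size of 20 (chosen so that minor allele counts run from 1 to 10) are assumptions, and the other histories would need different expressions that are not shown here.

```python
def expected_folded_afs(sample_size):
    """Expected proportions of sites by minor allele count, stationary model.

    Unfolded expectation is proportional to 1/i for derived count i.
    Folding adds the classes i and n - i; the middle class (i == n - i)
    would otherwise be counted twice, so it is halved.
    """
    n = sample_size
    weights = []
    for i in range(1, n // 2 + 1):
        w = 1.0 / i + 1.0 / (n - i)
        if i == n - i:
            w /= 2.0
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Assumed sample of 20 chromosomes: minor allele counts 1..10.
    for count, share in enumerate(expected_folded_afs(20), start=1):
        print(count, round(share, 3))
```

Departures of an observed spectrum from these proportions, such as an excess or deficit of rare alleles, are the kind of signal that the alternative histories on this slide are meant to explain.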

Page 14

Ancestral inference: model fitting

[Figure: distribution over minor allele count (1 to 10); labels: “bottleneck” and “modest but uninterrupted expansion”]

Page 15

Allelic association

accgttatgcaga

acagttatgtaga

acagttatgcaga

accgttatgtaga

possible allele combinations (2-marker haplotypes)

higher recombination rate (r)

Page 16

Allelic association: LD

[Figure: r² in relation to recombination fraction and distance (kb); legend: European, Asian, African American]

measure of allelic association: “linkage disequilibrium (LD)”
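A common LD statistic is r² = D² / (p_A (1 − p_A) p_B (1 − p_B)), where D = p_AB − p_A p_B. The sketch below computes it for the two variable sites of the 13-base sequences used in these slides; the sample compositions and function names are hypothetical and serve only as illustration.

```python
def two_marker_haplotypes(sequences, site_1, site_2):
    """Reduce each sequence to its alleles at two marker positions (0-based)."""
    return [seq[site_1] + seq[site_2] for seq in sequences]


def ld_r2(haplotypes):
    """r^2 between two biallelic markers from a list of 2-marker haplotypes."""
    n = len(haplotypes)
    ref_1, ref_2 = haplotypes[0]  # alleles of the first haplotype as reference
    p_a = sum(h[0] == ref_1 for h in haplotypes) / n
    p_b = sum(h[1] == ref_2 for h in haplotypes) / n
    p_ab = sum(h[0] == ref_1 and h[1] == ref_2 for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic in the sample")
    return d * d / denom


# Hypothetical samples built from the sequences shown on the slides.
no_recombinants = ["accgttatgcaga", "acagttatgtaga"] * 5
all_four = ["accgttatgcaga", "acagttatgtaga", "acagttatgcaga", "accgttatgtaga"]

if __name__ == "__main__":
    print(ld_r2(two_marker_haplotypes(no_recombinants, 2, 9)))  # 1.0
    print(ld_r2(two_marker_haplotypes(all_four, 2, 9)))         # 0.0
```

With only the two parental haplotypes present, the markers are perfectly associated; once all four combinations occur at equal frequency, the association vanishes.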

Page 17

Haplotype structure

“haplotype block”