Introduction to Quantitative Trait Loci Linkage and Association Studies


Lon Cardon, Wellcome Trust Centre for Human Genetics, University of Oxford

Pak Sham, Institute of Psychiatry, King’s College London

Stacey Cherny, Both, and then some

QTL Mapping: Morning Schedule

09.00 – 10.00 Linkage Theory (overview) Sham

10.00 – 10.30 Illustrative application Cardon

11.00 – 11.30 Association/Linkage Disequilibrium Theory Sham

11.30 – 12.15 Application Cherny

12.15 – 12.30 Interpreting the results Cardon

• F:\lon\fulker_paper99.pdf

• Fourteenth International Twin Course (Advanced): Boulder, Colorado, March 2000

Positional Cloning of Complex Traits

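[Figure: LOD score plotted along a chromosome region, annotated with the stages Sib pairs, Chromosome Region and Association Study (Genetics), followed by Physical Mapping/Sequencing, Candidate Gene Selection/Polymorphism Detection and Mutation Characterization/Functional Annotation (Genomics).]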

Genome Screens for Linkage in Sib-pairs

In 1997/98, more than 20 genome screens were published using sib-pairs:
- Diabetes (IDDM + NIDDM)
- Asthma
- Osteoporosis
- Obesity
- Multiple Sclerosis
- Epilepsy
- Inflammatory Bowel Disease
- Celiac Disease
- Psychiatric Disorders
- Behavioral Traits
- others...

Scan Rate at least 2-fold greater in 1998/1999

Many more studies of specific loci, candidate gene regions

Disequilibrium Mapping

• Hundreds of candidate gene studies every year

• Replications rare

• Genome-wide SNP maps expected in late 2001 (300,000 SNPs; ~ 1 SNP/10 kb)

• Applications in epidemiology, drug design, functional assessment, …

Likelihood for Variance Components Applications

$$\log(L) = c - \frac{1}{2}\sum_{i=1}^{n} \log|\Sigma_i| - \frac{1}{2}\sum_{i=1}^{n} (y_i - \mu)'\,\Sigma_i^{-1}\,(y_i - \mu)$$

where $y_i$ is the vector of phenotypes for the ith family,

$\Sigma_i$ is a function of polygenic effects, environmental effects, major loci, interactions, etc.,

and $\mu$ may be used to incorporate a wide range of covariates, including association/disequilibrium parameters.

Lange, Westlake & Spence, AJHG, 1976
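The likelihood on this slide is a sum of multivariate normal contributions, one per family. A minimal sketch for the sib-pair case (families of size two) follows. The function names, the tuple layout of the inputs and the use of a closed-form 2x2 inverse are illustrative assumptions, not part of the original course material.

```python
import math

def pair_loglik(y, mu, sigma):
    """Log-likelihood of one sib pair under a bivariate normal model.

    y:     (y1, y2) phenotypes of the two sibs
    mu:    (mu1, mu2) expected phenotypes (hypothetical layout)
    sigma: ((s11, s12), (s12, s22)) covariance matrix of the pair
    """
    (s11, s12), (_, s22) = sigma
    det = s11 * s22 - s12 * s12
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = y[0] - mu[0], y[1] - mu[1]
    # Quadratic form (y - mu)' Sigma^{-1} (y - mu) via the 2x2 inverse.
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    const = -math.log(2.0 * math.pi)  # normalizing constant for a 2-vector
    return const - 0.5 * math.log(det) - 0.5 * quad

def total_loglik(pairs):
    """Sum the per-family contributions; pairs is a list of (y, mu, sigma)."""
    return sum(pair_loglik(y, mu, sigma) for y, mu, sigma in pairs)
```

In practice the covariance matrix and the means would be functions of the model parameters, and the total would be maximized numerically over them.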

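Linear Model of Association (Fulker et al, AJHG, 1999)

Biometrical basis

$$y_{ij} = \mu + G_{ij} + g_{ij} + e_{ij}$$

$$G_{ij} = \begin{cases} a & \text{if genotype}_{ij} = BB \\ 0 & \text{if genotype}_{ij} = Bb \\ -a & \text{if genotype}_{ij} = bb \end{cases}$$

Variance model (linkage)

$$\mathrm{Cov}(y_{ij}, y_{ik} \mid \pi_{ijk}) = \begin{cases} \sigma_a^2 + \sigma_g^2 + \sigma_e^2 & \text{if } j = k \\ \pi_{ijk}\,\sigma_a^2 + \tfrac{1}{2}\sigma_g^2 & \text{if } j \neq k \end{cases}$$

Means model (association)

$$E(y_{ij}) = \mu + \beta_b b_i + \beta_w w_{ij}$$

Population association is parameterized independently of linkage (unlike TDT)

$\pi_{ijk}$ = proportion of alleles shared ibd at marker

$\sigma_a^2$ = additive genetic variance parameter

$\sigma_g^2$ = polygenic (residual) variance parameter

$\sigma_e^2$ = environmental (residual) variance parameter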
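As a rough illustration of the variance model (linkage) and means model (association) on this slide, the sketch below builds the expected covariance matrix of a sib pair from the proportion of alleles shared ibd and the three variance parameters, and the expected phenotype from between and within components. Function and argument names are hypothetical.

```python
def sib_pair_covariance(pi_hat, var_a, var_g, var_e):
    """Expected 2x2 phenotypic covariance matrix for one sib pair.

    pi_hat: proportion of alleles shared ibd at the marker (0 to 1)
    var_a:  additive genetic variance at the locus
    var_g:  polygenic (residual) variance
    var_e:  environmental (residual) variance
    """
    if not 0.0 <= pi_hat <= 1.0:
        raise ValueError("pi_hat must lie in [0, 1]")
    diagonal = var_a + var_g + var_e               # j == k
    off_diagonal = pi_hat * var_a + 0.5 * var_g    # j != k
    return ((diagonal, off_diagonal), (off_diagonal, diagonal))

def expected_phenotype(mu, beta_b, beta_w, b_i, w_ij):
    """Means model: grand mean plus between- and within-pair components."""
    return mu + beta_b * b_i + beta_w * w_ij
```

Because the association parameters enter only through the means and linkage only through the covariance, the two can be tested separately or jointly within one likelihood.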

Application: ACE

• British population

• Circulating ACE levels
– Normalized separately for males / females

• 10 di-allelic polymorphisms
– 26 kb
– Common
– In strong linkage disequilibrium

• Keavney et al, HMG, 1998

Angiotensin-1 Converting Enzyme: Families

Keavney et al. (1998) Hum Mol Gen, 7:1745-1751

• 83 extended families
• 4 - 18 members/family
• Age: 19-90 years

Families ascertained for study of blood pressure

Phenotype: plasma ACE activity, standardized within gender

No correlation between ACE and SBP or DBP

ACE Markers and Disequilibrium

[Figure, two panels. Panel 1: frequency of the associated allele (axis 0 to 0.8) at markers 1 to 10. Panel 2: D' (axis 0.75 to 1) between adjacent markers (n, n+1), pairs 1 to 9.]

Data from Keavney et al. (1998) Hum Mol Gen, 7:1745-1751

Angiotensin Converting Enzyme Marker/IBD Files

• F:\lon\2000\linkage.mx

• F:\lon\2000\marker*.mx

Linkage in Sib-pairs

[Figure: linkage chi-squared (axis 0 to 20) by marker position (axis 0 to 10).]

Between Pairs Model of Association (Fulker et al, AJHG, 1999)

Biometrical Model

Genotype   Genetic Value
BB          a
Bb          0
bb         -a

[Figure: genotypic scale with bb and BB 2a apart and Bb between them.]

Between Pair Expectations

G1   G2   A1   A2   Mean
BB   BB    a    a    a
BB   Bb    a    0    a/2
BB   bb    a   -a    0
Bb   BB    0    a    a/2
Bb   Bb    0    0    0
Bb   bb    0   -a   -a/2
bb   BB   -a    a    0
bb   Bb   -a    0   -a/2
bb   bb   -a   -a   -a

• Genotype-phenotype associations between pairs may result from allelic association or from population substructure

Within Model of Association (Fulker et al, AJHG, 1999)

Biometrical Model

Genotype   Genetic Value
BB          a
Bb          0
bb         -a

[Figure: genotypic scale with bb and BB 2a apart and Bb between them.]

Within Expectations

G1   G2   A1   A2   Diff1   Diff2
BB   BB    a    a    0       0
BB   Bb    a    0    a/2    -a/2
BB   bb    a   -a    a      -a
Bb   BB    0    a   -a/2     a/2
Bb   Bb    0    0    0       0
Bb   bb    0   -a    a/2    -a/2
bb   BB   -a    a   -a       a
bb   Bb   -a    0   -a/2     a/2
bb   bb   -a   -a    0       0

• Genotype-phenotype associations within pairs unaffected by sampling artifacts

• Difference = 0 unless 1 parent heterozygous (cf. TDT)
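The between-pair mean and within-pair differences tabulated for this model can be computed directly from the two sibs' genotypes. The sketch below expresses genetic values in units of a; the dictionary and function names are illustrative assumptions.

```python
# Genetic value of each genotype, in units of a (BB = a, Bb = 0, bb = -a).
GENETIC_VALUE = {"BB": 1, "Bb": 0, "bb": -1}

def between_within(g1, g2):
    """Return (pair mean, sib 1 deviation, sib 2 deviation) in units of a."""
    a1, a2 = GENETIC_VALUE[g1], GENETIC_VALUE[g2]
    mean = (a1 + a2) / 2
    return mean, a1 - mean, a2 - mean

# Example rows from the expectations tables:
print(between_within("BB", "Bb"))  # mean a/2, differences a/2 and -a/2
print(between_within("BB", "bb"))  # mean 0, differences a and -a
```

The pair mean carries the between-pair information, which can be distorted by population substructure, while the deviations carry the within-pair information, which is not.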

Parameter Expectations

$$E(\beta_w) = aD/(rs)$$

$$E(\beta_b) = aD/(rs) + f(p_k, \mu_k)$$

$$\hat{\sigma}_a^2 = 2pqa^2 - 2rs\,(aD/rs)^2 = 2pqa^2(1 - R^2)$$

Let

a = additive genetic value

D = disequilibrium coefficient between the q1 and m1 alleles [P(m1q1) - P(m1)P(q1)]

r = frequency of the m1 allele (s = 1 – r)

p = frequency of the q1 allele (q = 1 – p)

R = correlation between numbered alleles at marker and QTL

k = population strata counter
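To make these quantities concrete, the sketch below computes D, the correlation R, the standard normalized D' (Lewontin's D/Dmax, which is an addition here and not defined on this slide), the expected within-pair coefficient aD/(rs), and the residual linkage variance 2pqa²(1 − R²). Function and key names are hypothetical.

```python
import math

def ld_summary(p_m1q1, r, p, a):
    """LD measures for marker allele m1 (freq r) and QTL allele q1 (freq p).

    p_m1q1: frequency of the m1-q1 haplotype
    a:      additive genetic value at the QTL
    """
    s, q = 1.0 - r, 1.0 - p
    d = p_m1q1 - r * p
    # Largest |D| attainable given the allele frequencies and the sign of D.
    d_max = min(r * q, s * p) if d >= 0 else min(r * p, s * q)
    corr = d / math.sqrt(p * q * r * s)
    return {
        "D": d,
        "D_prime": d / d_max if d_max > 0 else 0.0,  # signed D/Dmax
        "R": corr,
        "beta_w": a * d / (r * s),
        "residual_var_a": 2.0 * p * q * a * a * (1.0 - corr * corr),
    }

# Complete LD with equal allele frequencies: the marker behaves like the QTL.
print(ld_summary(p_m1q1=0.5, r=0.5, p=0.5, a=1.0))
```

When R approaches 1 the marker absorbs the QTL effect in the means, so little linkage variance remains; when R is small, most of the linkage signal is left over after modelling association.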

Variance Components Association Model - Obvious Uses -

Test of linkage only (typical VC): $\sigma_a^2 = 0$

Test of substructure: $\beta_b = \beta_w$

Powerful test in absence of stratification: $\beta_a = \beta_{b+w} = 0$

Test of linkage in presence of association: $\sigma_a^2 = 0$ ($\beta_a$ free)
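Each of these tests compares a full model with a constrained one by a likelihood-ratio chi-squared, and the later plots report results both as chi-squared and as LOD. A minimal sketch of the conversions, assuming one constrained parameter and using only the standard library (function names are illustrative):

```python
import math

def lrt_chi2(loglik_full, loglik_null):
    """Likelihood-ratio statistic from two maximized natural-log likelihoods."""
    return 2.0 * (loglik_full - loglik_null)

def chi2_to_lod(chi2):
    """Convert a likelihood-ratio chi-squared to a LOD score (log base 10)."""
    return chi2 / (2.0 * math.log(10.0))

def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

For a variance component tested against zero the parameter sits on the boundary of its range, so the nominal 1-df p-value is commonly halved; that adjustment is not applied here.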

Variance Components Test for Linkage Disequilibrium - Power of Testing Linkage vs LD -

[Figure: power of the linkage test (axis 0 to 0.5) and power of the LD test (axis 0 to 1) as a function of D/Dmax (0.25 to 1); series: Linkage, LD.]

Linkage in Sib-pairs

[Figure: linkage chi-squared (axis 0 to 20) by marker position (axis 0 to 10).]

Linkage in the presence of association

[Figure: chi-squared for linkage given association (axis 0 to 7) at markers 1 to 10.]

Linkage and Association in Sib Pairs

[Figure: linkage chi-squared (axis 0 to 8) and association chi-squared (axis 0 to 100) at markers 1 to 10.]

Markers on the x-axis of the next four figures: T-5991C, A-5466C, T-3892C, A-240T, T-93C, T1237C, G2215A, I/D, G2350A, 4656(CT)3/2

Evidence for Linkage: Full Sample

[Figure: LOD (axis 0 to 10) at each marker.]

Evidence Against Complete LD: Full Sample

[Figure: LOD (axis 0 to 1.5) at each marker.]

Evidence for Association: Full Sample

[Figure: LOD (axis 0 to 15) at each marker.]

Drawing Conclusions: Full Sample

[Figure: LOD for association (axis 0 to 15) overlaid with LOD against complete LD (axis 0 to 1.5) at each marker.]

ACE Example Summary

• Agrees with haplotype analysis

• Distinguishes complete and incomplete disequilibrium
– Measure of distance for incomplete LD
– Indicator of trait allele frequencies

• Typical or fairy-tale?

D' Estimates, Oxford ACE Data

[Figure: D' (axis 0.5 to 1) at markers 1 to 10; series: observed D' with I/D, minimum estimated D'.]

QTL Allele Frequency Estimation

[Figure: QTL allele frequency (axis 0 to 0.7) at markers 1 to 10.]

Useful diagnostics

• Fit association and linkage models separately

• Provide indicator of distance
– Minimum D’ (D’min)

• Select next markers
– Range for QTL alleles (pmin, pmax)

Haplotype Analysis

• 3 clades
– All common haplotypes
– >90% of all haplotypes

• “B” = “C”
– Equal phenotypic effect
– Functional variant on right

• Keavney et al (1998)

[Figure: haplotype tree with clades A, B and C. Haplotypes shown: TATATTAIA3, TATATCGIA3, TATATTGIA3, CCCTCCGDG2, CCCTCCADG2, TATATCADG2, TACATCADG2.]

Case/Control Studies: Admixture

Consider two case/control samples, A and B, genotyped at a marker with alleles M and m. Neither has any association:

Sample ‘A’
              M      m      Freq.
Affected      50     50     .10
Unaffected    450    450    .90
Freq.         .50    .50

$\chi^2_1$ is n.s.

Sample ‘B’
              M      m      Freq.
Affected      1      9      .01
Unaffected    99     891    .99
Freq.         .10    .90

$\chi^2_1$ is n.s.

Now consider a single sample comprised of A + B:

              M      m      Freq.
Affected      51     59     .055
Unaffected    549    1341   .945
Freq.         .30    .70

$\chi^2_1$ = 14.84, p < 0.001

…Association can be induced by mixed samples
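The admixture example can be checked numerically. The sketch below computes the Pearson chi-squared (no continuity correction, which is an assumption about how the slide's value was obtained) for each sample and for the pooled sample; the function and variable names are illustrative.

```python
def chi2_2x2(table):
    """Pearson chi-squared for a 2x2 table given as ((a, b), (c, d))."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Rows: affected, unaffected. Columns: allele M, allele m.
sample_a = ((50, 50), (450, 450))
sample_b = ((1, 9), (99, 891))
combined = tuple(
    tuple(x + y for x, y in zip(row_a, row_b))
    for row_a, row_b in zip(sample_a, sample_b)
)

for name, table in (("A", sample_a), ("B", sample_b), ("A+B", combined)):
    print(name, round(chi2_2x2(table), 2))
```

Neither sample shows any association on its own, yet pooling them produces a large statistic because both disease prevalence and allele frequency differ between the two populations.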

The Spielman TDT

• Traditional case-control
– Compare allele frequencies in two samples
  • Cases and controls must be one population

• Heterozygous parents
– Parental alleles are the study population
– Population allele frequencies fixed
  • 50:50, independent of original
– Test for excess among affected offspring

[Figure: pedigree with parental genotypes 1/2 and 3/4 and offspring genotype 1/3.]

TDT based on $(T - NT)^2/(T + NT)$
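Here T and NT count how often heterozygous parents transmit, or do not transmit, a given allele to affected offspring. A minimal sketch of the statistic, referred to a chi-squared distribution with 1 df (the function name and the example counts are hypothetical):

```python
import math

def tdt(transmitted, not_transmitted):
    """TDT chi-squared (1 df) and p-value from transmission counts."""
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative transmissions from heterozygous parents")
    chi2 = (transmitted - not_transmitted) ** 2 / total
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 df
    return chi2, p_value

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt(60, 40)
print(chi2, p_value)
```

Under the null hypothesis each heterozygous parent transmits either allele with probability one half, which is why the expected split is 50:50 whatever the population allele frequencies are.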

Transmission/Disequilibrium Test

• TDT uses only heterozygous parents
Consequence: at different markers with variable allele frequencies, analyses are based on different subsets of the overall sample, which creates difficulties for localization

• TDT evaluates linkage in presence of association, i.e., it is a joint test
Consequence: given positive evidence, cannot distinguish between strong linkage and strong association

• Several sibling-based extensions developed

Family-based Association Methods for Quantitative Traits

Reference                             Design
Allison, D.B., AJHG, 1997             Selected parent-offspring trios
Rabinowitz, D., Hum Hered, 1997       Nuclear families
Fulker, D.W. et al., AJHG, 1999       Sib-pairs without parents
Elston, R.C. et al., AJHG, 1999       General pedigrees (linkage)
Allison, D.B. et al., AJHG, 1999      Sibships with/without parents (linkage)
Abecasis, G. et al., AJHG, 2000       General pedigrees with/without parents
Cardon, L.R., Hum Hered, 2000         Sib-pairs with GxE, epistatic interactions
Monks, S. et al., abstract, 1999      Nuclear families

Primary aim: association test free of pop. sub-structure effects

Quantitative Genetic Model

Genotype   Genetic Value
BB          a
Bb          d
bb         -a

[Figure: genotypic scale with bb and BB 2a apart; Bb lies at distance d from the midpoint.]

Simple Association Model

• Fit by linear regression
– Phenotype ($y_{ij}$)
– Mean ($\mu$)
– Number of ‘B’ alleles at marker ($g_{ij}$)

• Evidence for association when $\beta_a \neq 0$

$$y_{ij} = \mu + \beta_a g_{ij} + \varepsilon_{ij}$$
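A bare-bones version of this regression, phenotype on the number of ‘B’ alleles, can be written with ordinary least squares. The sketch below treats all individuals as independent, which ignores the correlation between sibs that the variance components model handles; the function name and the toy data are illustrative.

```python
def fit_association(allele_counts, phenotypes):
    """Least-squares estimates (mu, beta_a) for y = mu + beta_a * g + error."""
    n = len(phenotypes)
    if n != len(allele_counts) or n < 2:
        raise ValueError("need matching lists with at least two observations")
    g_mean = sum(allele_counts) / n
    y_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in allele_counts)
    if sxx == 0:
        raise ValueError("allele counts do not vary; slope is undefined")
    sxy = sum((g - g_mean) * (y - y_mean)
              for g, y in zip(allele_counts, phenotypes))
    beta_a = sxy / sxx
    mu = y_mean - beta_a * g_mean
    return mu, beta_a

# Toy data: each extra 'B' allele raises the phenotype by 0.5.
mu, beta_a = fit_association([0, 1, 2, 1], [1.0, 1.5, 2.0, 1.5])
print(mu, beta_a)
```

A slope that differs from zero is the evidence for association; in this simple form it cannot separate true allelic association from population substructure.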

Linear Model of Association in Sib-pairs

$$E(y_{ij}) = \mu + \beta_b b_i + \beta_w w_{ij}$$

$b_i$ and $w_{ij}$ are defined on the basis of the marker genotype, i.e., $\beta_b$ and $\beta_w$ are f(genotype(QTL), genotype(marker), $D_{mq}$)

ACE: D’min, pmin and pmax

                Expected     Actual
                T-5991C      G2215A      I/D         G2350A
D’              > 0.78       0.78        0.82        0.85
Minor allele    .15 – .48    .45 – .50   .45 – .50   .45 – .50

Recommended