QTL mapping in animals
• It works
• It’s cheap
• It’s relevant to human studies
Genomic resource
Nature December 5 2002
No more crosses?
In silico mapping
Method
We wanted to determine whether chromosomal regions regulating quantitative traits (QTL intervals) could be computationally predicted with the use of the mSNP database and available phenotypic information on inbred strains.
Method
Using the allelic distributions across inbred strains contained in the mSNP database, the computational method calculates genotypic distances between loci for a pair of mouse strains. These genotypic distances are then compared with phenotypic differences between the two mouse strains. The process is repeated for all mouse strain pairs for which phenotypic information is available. Lastly, a correlation value is derived using linear regression on the phenotypic and genotypic distances for each genomic locus.
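The procedure described above can be sketched for a single locus as follows. This is a minimal illustration, not the published implementation: the strain names, allele strings and phenotype values are hypothetical, the genotypic distance is assumed to be the fraction of SNPs at which two strains differ, and the linear regression is summarised by its Pearson correlation.

```python
from itertools import combinations
from math import sqrt

def genotypic_distance(alleles_a, alleles_b):
    """Fraction of SNPs in a locus at which two strains differ (assumed metric)."""
    pairs = list(zip(alleles_a, alleles_b))
    return sum(a != b for a, b in pairs) / len(pairs)

def pearson(xs, ys):
    """Pearson correlation, i.e. the r of a simple linear regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def locus_correlation(genotypes, phenotypes):
    """Correlate genotypic distance with phenotypic difference over all strain pairs."""
    strains = sorted(set(genotypes) & set(phenotypes))
    g_dist, p_diff = [], []
    for s1, s2 in combinations(strains, 2):
        g_dist.append(genotypic_distance(genotypes[s1], genotypes[s2]))
        p_diff.append(abs(phenotypes[s1] - phenotypes[s2]))
    return pearson(g_dist, p_diff)

# Hypothetical toy data: four strains, four SNPs at one locus.
genotypes = {"A": "AAGT", "B": "AAGT", "C": "GGCC", "D": "GGCT"}
phenotypes = {"A": 1.0, "B": 1.2, "C": 3.1, "D": 2.9}
r_locus = locus_correlation(genotypes, phenotypes)
```

Repeating `locus_correlation` for every locus in the database and ranking loci by the resulting value would give the predicted QTL intervals.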
Recombinant Inbreds
F0 Parental Generation
F1 Generation
F2 Generation
Interbreeding for approximately 20 generations to produce recombinant inbreds
RI strain phenotypes
RI strain genotypes
Chr1  D1Byu4      B  B  D  B  B  D  D  B
Chr1  D1Rik100    B  B  D  B  B  D  D  B
Chr1  D1Rik101    B  B  D  .  D  .  B  B
Chr1  D1Rik102    B  B  D  D  D  B  D  D
Chr1  D1Rik103    B  B  D  D  B  B  D  B
Chr1  D1Rik104    B  B  D  B  B  D  D  B
Chr1  D1Rik86     B  B  D  B  B  .  D  B
Chr1  D1Rik87     B  B  D  B  B  D  D  B
Chr1  D1Rik88     B  B  D  .  B  D  D  B
Chr1  D1Rik89     B  B  D  B  B  D  D  B
Chr1  D1Rik90     B  B  D  D  B  B  D  B
Chr1  D1Rik91     B  B  D  D  B  B  D  B
Chr1  D1Rik92     B  B  D  D  B  B  D  B
Chr1  D1Rik94     B  B  D  D  B  B  D  B
Chr1  D1Rik95     B  B  D  D  B  B  D  B
Chr1  D1Rik96     B  B  D  D  B  B  D  B
Chr1  D1Rik97     B  B  D  B  B  D  D  B
Chr1  D1Rik98     B  B  D  B  B  D  D  B
Chr1  D1Rik99     B  B  D  B  B  D  D  B
Chr1  D1Hgu1      B  B  D  D  D  B  B  D
Chr1  Ugt1a1-rs1  B  B  D  D  D  B  B  ?
Chr1  D1Mit294    B  B  D  D  D  B  D  D
Chr1  D1Mit1      .  .  .  D  D  B  D  D
Chr1  D1Mit67     B  B  B  D  D  B  D  D
Chr1  D1Rp2       B  B  B  D  D  B  D  D
Chr1  D1Mit231    D  B  B  D  D  B  D  D
Chr1  Odc-rs10    B  B* D  D  D  B  D  D
Chr1  D1J2        D  B  B  D  D  B  D  D
Chr1  D1Mit211    D  B  B  D  D  B  D  D
Chr1  D1Nds4      D  B  B  D  D  B  D  D
QTL for airway responsiveness
Power
$n - 2 = \dfrac{(t_\alpha + t_\beta)^2}{s^2_{QTL} / s^2_{RES}}$
$t_\alpha$ and $t_\beta$ are values of the t distribution corresponding to the desired values of $\alpha$ and $\beta$
$s^2_{QTL}$ is the phenotypic variance explained by a QTL
$s^2_{RES}$ is the unexplained variance.
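As a rough illustration of the power formula, the sketch below solves it for n. The function names are hypothetical. Because the Python standard library has no t quantile function, the helper substitutes standard normal quantiles for the t values and treats the test as two-sided; both are assumptions of this sketch and the approximation is only reasonable for large n.

```python
from statistics import NormalDist

def required_n(t_alpha, t_beta, var_qtl, var_res):
    """Solve n - 2 = (t_alpha + t_beta)^2 / (var_qtl / var_res) for n."""
    return 2 + (t_alpha + t_beta) ** 2 / (var_qtl / var_res)

def normal_quantiles(alpha, power):
    """Large-sample stand-ins for t_alpha and t_beta (assumption: two-sided test)."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2), z.inv_cdf(power)

# A QTL explaining 5% of the phenotypic variance, alpha = 0.05, power = 90%.
t_a, t_b = normal_quantiles(alpha=0.05, power=0.90)
n_needed = required_n(t_a, t_b, var_qtl=0.05, var_res=0.95)
```

Under these assumptions `n_needed` comes out at roughly 200 individuals, which shows how quickly the required sample size grows as the QTL effect shrinks.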
Number  Power  QTL Effect
10 90 6110 50 674 90 884 50 83
Experimentally verified QTL for airway responsiveness
Chromosome LOD %Varexp
9 2.5 5.2
10 3.8 8.3
11 3.65 7.5
17 2.1 4.4
Zhang, Y. et al. A genome-wide screen for asthma-associated quantitative trait loci in a mouse model of allergic asthma. Hum. Mol. Genet. 8, 601-605 (1999).
Inbred Strain Cross
Quantitative Trait Locus Detection
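Marker QTL
[Diagram: a marker locus with alleles M and m linked to a QTL with alleles Q and q; r is the recombination frequency between them.]
Each marker genotype class can contain any of the three QTL genotypes:
MM   QQ  Qq  qq
Mm   QQ  Qq  qq
mm   QQ  Qq  qq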
Marker QTL
MM QQ
Mm QQ
Mm QQ
$P(QQ \mid MM) = (1-r)^2$
$P(Qq \mid MM) = 2r(1-r)$
$P(qq \mid MM) = r^2$
$(1-r)^2 + 2r(1-r) + r^2 = 1$
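The conditional probabilities above can be tabulated for any r. In this sketch the MM row is taken directly from the slide; the Mm and mm rows follow from the same F2 argument (they are the weights used in the likelihood slides later in the deck). The function name is hypothetical.

```python
def qtl_probabilities(marker_genotype, r):
    """P(QTL genotype | marker genotype) in an F2, recombination frequency r."""
    nr = 1 - r  # probability of no recombination between marker and QTL
    table = {
        "MM": {"QQ": nr ** 2, "Qq": 2 * r * nr, "qq": r ** 2},
        "Mm": {"QQ": r * nr, "Qq": nr ** 2 + r ** 2, "qq": r * nr},
        "mm": {"QQ": r ** 2, "Qq": 2 * r * nr, "qq": nr ** 2},
    }
    return table[marker_genotype]

probs_mm = qtl_probabilities("MM", r=0.1)
```

With r = 0.1 an MM individual is QQ with probability 0.81, Qq with probability 0.18 and qq with probability 0.01; with r = 0.5 the marker carries no information and every marker class has the F2 proportions 1/4, 1/2, 1/4.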
QTL Genotypic values
Alleles at the QTL: q and Q
Additive value: a
Degree of dominance: d
$QQ = \mu + 2a$
$Qq = \mu + a(1+d)$
$qq = \mu$
Mean values for marker genotypes
Marker alleles: M and m
Recombination frequency between QTL and marker: r
$MM = \mu + 2a(1-r)^2 + 2r(1-r)(1+d)a$
$Mm = \mu + 2a\,r(1-r) + (1 - 2r(1-r))(1+d)a$
$mm = \mu + 2a\,r^2 + 2r(1-r)(1+d)a$
Two things follow
• Contrasts of single marker means can be used to detect QTL
QTLeffects.xls
Example
r = 0.1
$MM = \mu + 2a \times 0.81 + 0.18 \times (1+d)a$    coefficients: $(1-r)^2$ and $2r(1-r)$
$Mm = \mu + 2a \times 0.09 + 0.82 \times (1+d)a$    coefficients: $r(1-r)$ and $1-2r(1-r)$
$mm = \mu + 2a \times 0.01 + 0.18 \times (1+d)a$    coefficients: $r^2$ and $2r(1-r)$
r = 0.5
$MM = \mu + 2a \times 0.25 + 0.5 \times (1+d)a$
$Mm = \mu + 2a \times 0.25 + 0.5 \times (1+d)a$
$mm = \mu + 2a \times 0.25 + 0.5 \times (1+d)a$
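The coefficients in this example can be recomputed for any recombination frequency. The sketch below is illustrative only; the function name is hypothetical and it simply evaluates the coefficient expressions from the marker-mean formulas.

```python
def marker_mean_coefficients(r):
    """For each marker genotype, return (coefficient of 2a, coefficient of (1+d)a)."""
    c = 2 * r * (1 - r)
    return {
        "MM": ((1 - r) ** 2, c),
        "Mm": (r * (1 - r), 1 - c),
        "mm": (r ** 2, c),
    }

for r in (0.1, 0.5):
    for genotype, (coef_2a, coef_da) in marker_mean_coefficients(r).items():
        print(f"r={r}  {genotype}: 2a * {coef_2a:.2f} + {coef_da:.2f} * (1+d)a")
```

At r = 0.5 all three rows are identical, so the marker means no longer differ and the marker cannot reveal the QTL.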
Example
REAL_DATA/Real data.xls
Two things follow
• Contrasts of single marker means can be used to detect QTL
• Estimates of position and effect are confounded
Additive and dominance estimates
Additive effect: $(MM - mm)/2 = (1-2r)\,a$
Dominance effect: $\dfrac{Mm - (MM + mm)/2}{(MM - mm)/2} = d\,(1-2r)$
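The confounding of position and effect can be checked numerically. The following sketch, with hypothetical function names and parameter values, computes the marker means from the formulas in this deck and then applies the two estimators: a large QTL at r = 0.1 and a smaller QTL sitting exactly at the marker give the same estimates.

```python
def marker_means(mu, a, d, r):
    """Expected marker-class means, using QQ = mu + 2a, Qq = mu + a(1+d), qq = mu."""
    c = 2 * r * (1 - r)
    return {
        "MM": mu + 2 * a * (1 - r) ** 2 + c * (1 + d) * a,
        "Mm": mu + 2 * a * r * (1 - r) + (1 - c) * (1 + d) * a,
        "mm": mu + 2 * a * r ** 2 + c * (1 + d) * a,
    }

def single_marker_estimates(means):
    """Return (additive, dominance) estimates from the three marker means."""
    additive = (means["MM"] - means["mm"]) / 2
    dominance = (means["Mm"] - (means["MM"] + means["mm"]) / 2) / additive
    return additive, dominance

# Hypothetical parameter values chosen to illustrate the confounding.
loose = single_marker_estimates(marker_means(mu=10.0, a=1.0, d=0.5, r=0.1))
tight = single_marker_estimates(marker_means(mu=10.0, a=0.8, d=0.4, r=0.0))
```

Both calls return an additive estimate of about 0.8 and a dominance estimate of about 0.4, so a single marker cannot separate the size of the effect from the distance to the QTL.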
Flanking markers
[Diagram: two flanking marker loci, with alleles M1/m1 and M2/m2.]
The nine flanking-marker genotype classes:
M1M1 M2M2   M1M1 M2m2   M1M1 m2m2
M1m1 M2M2   M1m1 M2m2   M1m1 m2m2
m1m1 M2M2   m1m1 M2m2   m1m1 m2m2
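Interval mapping
[Diagram: a QTL (alleles Q, q) between flanking markers M1/m1 and M2/m2. r1 is the recombination frequency between M1 and the QTL, r2 between the QTL and M2, and r12 between the two markers.]
$r_2 = \dfrac{r_{12} - r_1}{1 - 2r_1}$   (no interference)
$r_2 = r_{12} - r_1$   (complete interference)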
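A small sketch of how these two relations would be used when scanning an interval: r12 is known from the marker map, r1 is the trial position, and r2 follows. The function name and the numbers are hypothetical.

```python
def r2_from_flanking(r12, r1, complete_interference=False):
    """Recombination frequency between the QTL and M2, given r12 and a trial r1."""
    if complete_interference:
        return r12 - r1
    # No interference: r12 = r1 + r2 - 2*r1*r2, solved for r2.
    return (r12 - r1) / (1 - 2 * r1)

r12 = 0.26
r2_no_interference = r2_from_flanking(r12, r1=0.1)
r2_complete = r2_from_flanking(r12, r1=0.1, complete_interference=True)
```

Here `r2_no_interference` is 0.2 (since 0.1 + 0.2 - 2 × 0.1 × 0.2 = 0.26) and `r2_complete` is 0.16.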
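Interval mapping
[Diagram: a QTL (alleles Q, q) between flanking markers M1/m1 and M2/m2, with recombination frequencies r1, r2 and r12.]
Marker genotype M1M1 M2M2: probability that both gametes are M1 Q M2
$p(M_1 Q M_2 / M_1 Q M_2) = \left(\dfrac{(1-r_1)(1-r_2)}{2}\right)^2$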
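Interval mapping
[Diagram: a QTL (alleles Q, q) between flanking markers M1/m1 and M2/m2, with recombination frequencies r1, r2 and r12.]
$p(QQ \mid M_1M_1M_2M_2) = \dfrac{(1-r_1)^2 (1-r_2)^2}{(1-r_{12})^2}$
$p(Qq \mid M_1M_1M_2M_2) = \dfrac{2 r_1 r_2 (1-r_1)(1-r_2)}{(1-r_{12})^2}$
$p(qq \mid M_1M_1M_2M_2) = \dfrac{r_1^2 r_2^2}{(1-r_{12})^2}$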
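The three conditional probabilities can be evaluated at any trial position inside the interval. This is a sketch that assumes no interference, so that r2 is derived from r12 and r1 as on the earlier slide; the function name and the numbers are hypothetical.

```python
def qtl_probs_M1M1M2M2(r1, r12):
    """P(QTL genotype | M1M1 M2M2) for a QTL at recombination r1 from M1."""
    r2 = (r12 - r1) / (1 - 2 * r1)  # no-interference relation
    denom = (1 - r12) ** 2
    return {
        "QQ": ((1 - r1) * (1 - r2)) ** 2 / denom,
        "Qq": 2 * r1 * r2 * (1 - r1) * (1 - r2) / denom,
        "qq": (r1 * r2) ** 2 / denom,
    }

probs = qtl_probs_M1M1M2M2(r1=0.1, r12=0.26)
```

With flanking markers the M1M1 M2M2 class is almost purely QQ (about 0.947 here), which is why two markers identify the QTL genotype far better than one.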
Significance thresholds
Permutation tests to establish thresholds
Churchill, G. A. and Doerge, R. W. Empirical threshold values for quantitative trait mapping. Genetics 138, 963-971 (1994).
An empirical method is described, based on the concept of a permutation test, for estimating threshold values that are tailored to the experimental data at hand.
Permutation tests
Trait values are randomly reassigned to genotypes
10,000 re-samplings for 1% value
Permutation tests
• Robust to departures from normality
• Robust to missing or erroneous data
• Easy to implement
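A minimal sketch of a permutation threshold, under simplifying assumptions that are not from the source: a backcross-style coding with two genotype classes (0 and 1) per marker, and the absolute difference between class means as the test statistic instead of a LOD score. The genome-wide statistic is the maximum over markers, recomputed after each shuffle of the trait values. Function names and the toy data are hypothetical.

```python
import random

def max_marker_statistic(trait, marker_genotypes):
    """Largest absolute difference in class means over all markers."""
    best = 0.0
    for genos in marker_genotypes:
        g0 = [z for z, g in zip(trait, genos) if g == 0]
        g1 = [z for z, g in zip(trait, genos) if g == 1]
        if g0 and g1:
            best = max(best, abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return best

def permutation_threshold(trait, marker_genotypes, alpha=0.05, n_perm=1000, seed=0):
    """Empirical (1 - alpha) quantile of the maximum statistic under permutation."""
    rng = random.Random(seed)
    shuffled = list(trait)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign trait values to genotypes
        null_stats.append(max_marker_statistic(shuffled, marker_genotypes))
    null_stats.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_stats[index]

# Toy data: 40 individuals, 5 markers; marker 0 carries a large effect.
data_rng = random.Random(1)
trait = [0.01 * i for i in range(20)] + [5.0 + 0.01 * i for i in range(20)]
markers = [[0] * 20 + [1] * 20]
markers += [[data_rng.randint(0, 1) for _ in range(40)] for _ in range(4)]
observed = max_marker_statistic(trait, markers)
threshold = permutation_threshold(trait, markers, alpha=0.05, n_perm=200, seed=2)
```

The slide's figure of 10,000 re-samplings for a 1% threshold would correspond to `alpha=0.01, n_perm=10000`; small values are used here only to keep the example fast.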
Significance Thresholds
                     Suggestive            Significant
Mapping method       P            LOD      P            LOD
Backcross            3.40E-03     1.9      1.00E-04     3.3
Intercross (2 df)    1.60E-03     2.8      5.20E-05     4.3
Lander, E. and Kruglyak, L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nature Genetics 11, 241-247 (1995).
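The P and LOD columns of the table are linked by a standard conversion, sketched below under the usual assumption that the likelihood ratio statistic, LR = 2 ln(10) × LOD, is asymptotically chi-square distributed. The function names are hypothetical and only 1 and 2 degrees of freedom are handled, since those have closed forms in the standard library.

```python
from math import erfc, exp, log, sqrt

def lod_to_lr(lod):
    """Likelihood ratio statistic corresponding to a LOD score."""
    return 2 * log(10) * lod

def pointwise_p(lod, df):
    """Upper-tail chi-square probability of the LR statistic (df = 1 or 2 only)."""
    lr = lod_to_lr(lod)
    if df == 1:
        return erfc(sqrt(lr / 2))
    if df == 2:
        return exp(-lr / 2)
    raise ValueError("this sketch handles only 1 or 2 degrees of freedom")

p_backcross = pointwise_p(3.3, df=1)
p_intercross = pointwise_p(4.3, df=2)
```

Both values come out close to the "Significant" P column of the table (about 1 × 10^-4 and 5 × 10^-5); small differences are expected because the published LOD thresholds are rounded.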
Maximum likelihood methods
Marker genotype: M
Phenotypic value: z
Variance: $\sigma^2$
Mean of QTL genotype $Q_k$: $\mu_{Q_k}$
$\varphi(z, \mu, \sigma^2)$ is the normal density with mean $\mu$ and variance $\sigma^2$
$L(z \mid MM) = (1-r)^2\,\varphi(z, \mu_{QQ}, \sigma^2) + 2r(1-r)\,\varphi(z, \mu_{Qq}, \sigma^2) + r^2\,\varphi(z, \mu_{qq}, \sigma^2)$
$L(z \mid Mm) = r(1-r)\,\varphi(z, \mu_{QQ}, \sigma^2) + ((1-r)^2 + r^2)\,\varphi(z, \mu_{Qq}, \sigma^2) + r(1-r)\,\varphi(z, \mu_{qq}, \sigma^2)$
$L(z \mid mm) = r^2\,\varphi(z, \mu_{QQ}, \sigma^2) + 2r(1-r)\,\varphi(z, \mu_{Qq}, \sigma^2) + (1-r)^2\,\varphi(z, \mu_{qq}, \sigma^2)$
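The single-marker likelihood is a three-component normal mixture whose weights are the conditional QTL genotype probabilities. The sketch below evaluates it for one individual; the function names and the parameter values are hypothetical, and no maximisation is attempted.

```python
from math import exp, pi, sqrt

def normal_density(z, mean, variance):
    """Normal probability density with the given mean and variance."""
    return exp(-(z - mean) ** 2 / (2 * variance)) / sqrt(2 * pi * variance)

def marker_likelihood(z, marker_genotype, r, qtl_means, variance):
    """L(z | marker genotype): mixture over QQ, Qq and qq."""
    nr = 1 - r
    weights = {
        "MM": {"QQ": nr ** 2, "Qq": 2 * r * nr, "qq": r ** 2},
        "Mm": {"QQ": r * nr, "Qq": nr ** 2 + r ** 2, "qq": r * nr},
        "mm": {"QQ": r ** 2, "Qq": 2 * r * nr, "qq": nr ** 2},
    }[marker_genotype]
    return sum(w * normal_density(z, qtl_means[g], variance)
               for g, w in weights.items())

# Hypothetical genotype means and variance.
qtl_means = {"QQ": 12.0, "Qq": 11.0, "qq": 10.0}
lik = marker_likelihood(11.5, "MM", r=0.1, qtl_means=qtl_means, variance=1.0)
```

The full-sample likelihood is the product of such terms over individuals (in practice the sum of their logarithms), maximised over the means, the variance and r.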
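Interval mapping
[Diagram: a QTL (alleles Q, q) between flanking markers M1 and M2, with recombination frequencies r1, r2 and r12.]
$L(z \mid M_1M_1M_2M_2) = \dfrac{(1-r_1)^2 (1-r_2)^2}{(1-r_{12})^2}\,\varphi(z, \mu_{QQ}, \sigma^2) + \dfrac{2 r_1 r_2 (1-r_1)(1-r_2)}{(1-r_{12})^2}\,\varphi(z, \mu_{Qq}, \sigma^2) + \dfrac{r_1^2 r_2^2}{(1-r_{12})^2}\,\varphi(z, \mu_{qq}, \sigma^2)$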
Maximum likelihood
Test statistic
$LR = -2 \ln\left(\dfrac{\max L_r(z)}{\max L(z)}\right)$
$\max L_r(z)$ is the maximum of the likelihood function under the null hypothesis of no segregating QTL (i.e. that the phenotypic distribution is a single normal)
Example
---------------------------------------|
D2MIT21-D2MIT22  37.0 cM
  0.0   0.294  -0.071   4.6%   5.069 | *************
  2.0   0.317  -0.074   5.3%   5.455 | **************
  4.0   0.341  -0.077   6.2%   5.861 | ****************
  6.0   0.365  -0.077   7.0%   6.279 | ******************
  8.0   0.389  -0.076   8.0%   6.701 | *******************
 10.0   0.410  -0.073   8.9%   7.114 | *********************
 12.0   0.431  -0.068   9.7%   7.505 | ***********************
 14.0   0.447  -0.061  10.5%   7.861 | ************************
 16.0   0.460  -0.054  11.0%   8.169 | *************************
 18.0   0.468  -0.046  11.4%   8.417 | **************************
 20.0   0.473  -0.039  11.7%   8.595 | ***************************
 22.0   0.473  -0.033  11.6%   8.699 | ***************************
 24.0   0.469  -0.026  11.4%   8.728 | ***************************
 26.0   0.460  -0.020  11.0%   8.684 | ***************************
 28.0   0.447  -0.015  10.4%   8.574 | ***************************
 30.0   0.432  -0.009   9.6%   8.409 | **************************
 32.0   0.414  -0.004   8.8%   8.199 | *************************
 34.0   0.394   0.000   8.0%   7.956 | ************************
 36.0   0.373   0.004   7.1%   7.694 | ***********************
---------------------------------------|
D2MIT22-D2MIT23  32.9 cM
  0.0   0.363   0.006   6.7%   7.563 | ***********************
  2.0   0.381   0.010   7.4%   7.705 | ***********************
  4.0   0.399   0.013   8.2%   7.811 | ************************
  6.0   0.414   0.016   8.8%   7.867 | ************************
  8.0   0.425   0.019   9.3%   7.862 | ************************
 10.0   0.433   0.021   9.7%   7.786 | ************************
 12.0   0.438   0.021   9.9%   7.631 | ***********************
 14.0   0.437   0.022   9.8%   7.394 | **********************
 16.0   0.431   0.023   9.6%   7.077 | *********************
 18.0   0.421   0.022   9.1%   6.684 | *******************
 20.0   0.405   0.019   8.4%   6.229 | *****************
 22.0   0.385   0.015   7.6%   5.726 | ***************
 24.0   0.360   0.008   6.6%   5.196 | *************
 26.0   0.333  -0.002   5.6%   4.662 | ***********
 28.0   0.303  -0.013   4.6%   4.146 | *********
 30.0   0.274  -0.026   3.8%   3.669 | *******
 32.0   0.246  -0.037   3.1%   3.244 | *****
---------------------------------------|
D2MIT23-D2MIT24  43.5 cM
  0.0   0.235  -0.041   2.8%   3.080 | *****
  2.0   0.241  -0.052   3.0%   3.028 | *****
  4.0   0.247  -0.066   3.2%   2.966 | ****
  6.0   0.251  -0.081   3.4%   2.894 | ****
  8.0   0.255  -0.100   3.6%   2.812 | ****
 10.0   0.256  -0.122   3.7%   2.721 | ***
 12.0   0.255  -0.146   3.9%   2.620 | ***
 14.0   0.251  -0.170   3.9%   2.511 | ***
 16.0   0.245  -0.197   4.0%   2.396 | **
 18.0   0.236  -0.224   4.1%   2.275 | **
 20.0   0.225  -0.249   4.1%   2.149 | *
 22.0   0.212  -0.267   4.1%   2.016 | *
 24.0   0.197  -0.279   3.9%   1.876 |
 26.0   0.181  -0.284   3.7%   1.728 |
 28.0   0.163  -0.280   3.3%   1.574 |
 30.0   0.145  -0.271   2.9%   1.416 |
 32.0   0.127  -0.255   2.4%   1.261 |
 34.0   0.109  -0.235   2.0%   1.113 |
 36.0   0.091  -0.213   1.6%   0.978 |
 38.0   0.074  -0.192   1.2%   0.860 |
 40.0   0.059  -0.172   0.9%   0.759 |
 42.0   0.046  -0.153   0.7%   0.676 |
---------------------------------------|
SIMULATED_DATA
WinQTL
Linear models
$z_{ik} = \mu + b_i + e_{ik}$
where $z_{ik}$ is the phenotype of the kth individual of marker genotype i
Linear models
$QQ = \mu + a \qquad Qq = \mu + d \qquad qq = \mu - a$
$z_j = \mu + a \cdot x(M_j) + d \cdot y(M_j) + e_j$
$x(M_j) = p(QQ \mid M_j) - p(qq \mid M_j)$
$y(M_j) = p(Qq \mid M_j)$
Linear models
$x(M_j) = p(QQ \mid M_j) - p(qq \mid M_j)$
$x(M_1M_1M_2M_2) = \dfrac{(1-r_1)^2 (1-r_2)^2 - r_1^2 r_2^2}{(1-r_{12})^2}$
$y(M_j) = p(Qq \mid M_j)$
$y(M_1M_1M_2M_2) = \dfrac{2 r_1 r_2 (1-r_1)(1-r_2)}{(1-r_{12})^2}$
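The two regressors for the M1M1 M2M2 class can be computed at each trial position in the interval. This sketch assumes no interference when deriving r2 from r12 and r1; the function name and the interval width are hypothetical.

```python
def regressors_M1M1M2M2(r1, r12):
    """Return (x, y) for marker class M1M1 M2M2 at trial position r1."""
    r2 = (r12 - r1) / (1 - 2 * r1)  # no-interference relation
    denom = (1 - r12) ** 2
    p_QQ = ((1 - r1) * (1 - r2)) ** 2 / denom
    p_Qq = 2 * r1 * r2 * (1 - r1) * (1 - r2) / denom
    p_qq = (r1 * r2) ** 2 / denom
    return p_QQ - p_qq, p_Qq

# Scan a hypothetical interval with r12 = 0.26 in steps of r1.
scan = [(r1, regressors_M1M1M2M2(r1, 0.26)) for r1 in (0.0, 0.05, 0.1, 0.15, 0.2, 0.26)]
```

At either end of the interval the QTL coincides with a marker, so x = 1 and y = 0; in between, x dips slightly below 1 and y rises, reflecting the small chance of a double recombinant.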
Significance test
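$LR = n \ln(SS_T / SS_E) = -n \ln(1 - r^2)$, where $SS_T$ and $SS_E$ are the total and residual sums of squares and $r^2$ is the proportion of phenotypic variance explained by the regression (not the recombination frequency)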
Degrees of freedom are the number of estimated QTL parameters, plus one for the map position
Matrix statement of Haley Knott regression
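$\hat{\beta}_{r_1} = (X_{r_1}^T X_{r_1})^{-1} X_{r_1}^T z$
$\hat{\beta}_{r_1}$ holds the estimates of $\mu$, a and d for a QTL at position $r_1$
ith row of matrix $X_{r_1}$: $(1,\ x(M_i, r_1),\ y(M_i, r_1))$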
Example
Regression example.xls
                              Position 0     Position 0.15    Position 0.3
Marker genotype   Phenotype   x      y       x       y        x      y
M1M1M2M2          5.6          1     0        0.91   0.9       1     0
M1M1M2M2          5.4          1     0        0.91   0.9       1     0
M1M1M2m2          5.3          1     0        0.56   0.4       0     1
M1m1M2m2          3.9          0     1        0      0.85      0     1
M1m1M2m2          3.3          0     1        0      0.85      0     1
M1m1M2M2          3.6          0     1        0.35   0.6       1     0
M1m1M2M2          3.7          0     1        0.35   0.6       1     0
m1m1M2m2          3.9         -1     0       -0.56   0.4       0     1
m1m1M2m2          3.5         -1     0       -0.56   0.4       0     1
m1m1m2m2          1.1         -1     0       -0.91   0.9      -1     0
m1m1m2m2          0.8         -1     0       -0.91   0.9      -1     0
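As an illustration, the regression at position 0 can be reproduced from the phenotypes and the first pair of x and y columns in the table. The sketch solves the normal equations $(X^T X)\beta = X^T z$ with a small hand-written Gaussian elimination to stay within the standard library, then forms LR = n ln(SST/SSE). The helper names are hypothetical and the spreadsheet referenced on the slide may compute things differently.

```python
from math import log

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def haley_knott(z, xs, ys):
    """Regress z on (1, x, y); return (mu, a, d) and LR = n ln(SST/SSE)."""
    X = [[1.0, x, y] for x, y in zip(xs, ys)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xtz = [sum(r[i] * zi for r, zi in zip(X, z)) for i in range(3)]
    beta = solve(XtX, Xtz)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_z = sum(z) / len(z)
    sst = sum((zi - mean_z) ** 2 for zi in z)
    sse = sum((zi - fi) ** 2 for zi, fi in zip(z, fitted))
    return beta, len(z) * log(sst / sse)

# Phenotypes and the position-0 x, y columns from the table.
z = [5.6, 5.4, 5.3, 3.9, 3.3, 3.6, 3.7, 3.9, 3.5, 1.1, 0.8]
x0 = [1, 1, 1, 0, 0, 0, 0, -1, -1, -1, -1]
y0 = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
(mu_hat, a_hat, d_hat), lr = haley_knott(z, x0, y0)
```

Repeating the call with the x and y columns for positions 0.15 and 0.3 and comparing the LR values gives the regression analogue of a likelihood profile along the interval.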
Problems of QTL detection
• Linked QTLs corrupt the position estimates
• Unlinked QTLs decrease the power of QTL detection
Extensions to linear regression
• Composite interval mapping
• Multiple interval mapping
Composite interval mapping
Zeng, Z.-B. Precision mapping of quantitative trait loci. Genetics 136, 1457-1468 (1994).
http://statgen.ncsu.edu/qtlcart/cartographer.html
Composite interval mapping
[Diagram: a QTL Q in the interval between markers M1 and M2; neighbouring markers M-1 and M3 lie outside the interval, with further QTL possible beyond them.]
$z_j = \mu + a \cdot x(M_j) + d \cdot y(M_j) + \sum_{k \neq i,\, i+1} b_k\, x_{kj} + e_j$
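To make the extra term concrete, the sketch below builds one row of the design matrix for composite interval mapping: the interval regressors x and y, plus one cofactor column for every marker other than the two flanking the tested interval. The coding of the cofactor markers as 1, 0 and -1 is an assumption of this example, as are the function name and the numbers.

```python
def cim_design_row(x_interval, y_interval, marker_codes, flanking):
    """Row (1, x, y, cofactors...) for one individual.

    marker_codes: the individual's coded genotype at every marker (assumed 1/0/-1)
    flanking: indices i and i+1 of the markers flanking the tested interval,
              which are excluded from the cofactors
    """
    cofactors = [code for k, code in enumerate(marker_codes) if k not in flanking]
    return [1.0, x_interval, y_interval] + cofactors

# Four markers (M-1, M1, M2, M3); the tested interval lies between M1 and M2.
row = cim_design_row(0.95, 0.05, marker_codes=[1, 1, 0, -1], flanking=(1, 2))
```

Here `row` is `[1.0, 0.95, 0.05, 1, -1]`: the markers M-1 and M3 enter as cofactors and absorb the effects of QTL outside the tested interval.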
Example
SIMULATED_DATA
WinQTL
Multiple Interval Mapping
Example?