Basic QTL Analysis
Is there an association between marker genotype and quantitative trait phenotype?
- Classify progeny by marker genotype
- Compare phenotypic mean between classes (t-test or ANOVA)
- Significance = marker linked to QTL
- Difference between means = estimate of QTL effect
g = (µ1 - µ2)/2
g = genotypic effect
µ1 = trait mean for genotypic class AA
µ2 = trait mean for genotypic class aa
[Figure: trait value y plotted against genotypic classes aa and AA on the x axis, with intercept β0]
Notations for single-QTL models in backcross, DH and F2 populations

Model                  Genotype   Value
Backcross (Qq x QQ)    QQ         µ1
                       Qq         µ2
                       Genetic effect: g = 0.5(µ1 - µ2)
DH (qq x QQ)           QQ         µ1
                       qq         µ2
                       Genetic effect: g = 0.5(µ1 - µ2)
F2 (Qq x Qq)           QQ         µ1
                       Qq         µ2
                       qq         µ3
                       Additive: a = 0.5(µ1 - µ3)
                       Dominance: d = 0.5(2µ2 - µ1 - µ3)
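The effect formulas in this table can be checked with a few lines of Python. This is a minimal sketch: the function names and the class means are made up for illustration and do not come from the slides.

```python
def backcross_effect(mu1, mu2):
    """Genetic effect g = 0.5 * (mu1 - mu2) for a backcross or DH population."""
    return 0.5 * (mu1 - mu2)


def f2_effects(mu_QQ, mu_Qq, mu_qq):
    """Additive (a) and dominance (d) effects for an F2 population."""
    a = 0.5 * (mu_QQ - mu_qq)
    d = 0.5 * (2 * mu_Qq - mu_QQ - mu_qq)
    return a, d


# Hypothetical class means, for illustration only.
g = backcross_effect(12.0, 10.0)
a, d = f2_effects(12.0, 11.5, 8.0)
print(f"g = {g}, a = {a}, d = {d}")
```

A positive d here means the heterozygote mean lies above the midpoint of the two homozygote means.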
Single-marker analysis
• How it works
  – Finds associations between marker genotype and trait value
• When to use
  – Order of markers unknown or incomplete maps
  – Quick scan
  – Find best possible QTLs
  – Identify missing or incorrectly formatted data
• Limitations
  – Underestimates QTL number and effects
  – QTL position cannot be precisely determined
[Diagram: marker A and putative QTL Q, separated by recombination fraction r]

$$y_j = \mu + f(A) + \varepsilon_j$$

r = recombination fraction
yj = trait value for the jth individual in the population
µ = population mean
f(A) = function of marker genotype
εj = residual associated with the jth individual
Single-marker analysis in backcross progeny
• Parents: AAQQ x aaqq
• Backcross: F1 (AaQq) x aaqq, or F1 (AaQq) x AAQQ

BC progeny (x aaqq)   BC progeny (x AAQQ)   Expected frequency
AaQq                  AAQQ                  0.5(1 - r)
Aaqq                  AAQq                  0.5r
aaQq                  AaQQ                  0.5r
aaqq                  AaQq                  0.5(1 - r)

r is the recombination frequency between A and Q
Expected QTL genotypic frequencies conditional on marker genotypes

Marker     Observed   Marginal     QTL genotype            Expected trait value
genotype   count      frequency    QQ          Qq

Joint frequency
AA         n1         0.5          0.5(1-r)    0.5r
Aa         n2         0.5          0.5r        0.5(1-r)

Conditional frequency
AA         n1         0.5          1-r         r           (1-r)µ1 + rµ2
Aa         n2         0.5          r           1-r         rµ1 + (1-r)µ2
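The conditional frequencies imply that the difference between marker-class means shrinks by a factor (1 - 2r) relative to the difference between QTL genotype means. A small sketch, with a hypothetical function name and made-up values for µ1 and µ2:

```python
def expected_marker_means(mu1, mu2, r):
    """Expected trait means of marker classes AA and Aa in a backcross.

    mu1, mu2: trait means of QTL genotypes QQ and Qq.
    r: recombination fraction between marker A and QTL Q.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    mean_AA = (1 - r) * mu1 + r * mu2
    mean_Aa = r * mu1 + (1 - r) * mu2
    return mean_AA, mean_Aa


mu1, mu2 = 12.0, 10.0  # hypothetical QTL genotype means
for r in (0.0, 0.1, 0.3, 0.5):
    mean_AA, mean_Aa = expected_marker_means(mu1, mu2, r)
    print(f"r = {r:.1f}  AA = {mean_AA:.2f}  Aa = {mean_Aa:.2f}  "
          f"difference = {mean_AA - mean_Aa:.2f}")
```

At r = 0.5 the two marker classes have the same expected mean, so an unlinked marker carries no information about the QTL.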
Single-marker analysis
- Simple t-test
- Analysis of variance
- Linear regression
- Likelihood

[Diagram: marker A and putative QTL Q, separated by recombination fraction r]
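Simple t-test using backcross progeny

Model:
$$Y_{j(i)k} = \mu + M_i + g(M)_{j(i)} + e_{j(i)k}$$

Test statistic with the pooled variance:
$$t_M = \frac{\hat{\mu}_{Aa} - \hat{\mu}_{aa}}{\sqrt{s_M^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

Test statistic with separate class variances:
$$t_M = \frac{\hat{\mu}_{Aa} - \hat{\mu}_{aa}}{\sqrt{\frac{\hat{s}_{Aa}^2}{n_1} + \frac{\hat{s}_{aa}^2}{n_2}}}$$

H0: µAa - µaa = 0, which holds when (a + d) = 0 or r = 0.5
t-distribution with df = N - 2
If tM is significant, then a QTL is declared to be near the marker

Yj(i)k = trait value for individual j with genotype i in replication k
µ = population mean
Mi = effect of the marker genotype
g(M)j(i) = genotypic effect which cannot be explained by the marker genotype
ej(i)k = error term
µAa = trait mean for genotypic class Aa
µaa = trait mean for genotypic class aa
s2M = pooled variance within the two classes
n1, n2 = number of individuals in classes Aa and aa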
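A minimal sketch of the pooled-variance t statistic, using only the Python standard library. The function name and the simulated trait values are illustrative assumptions, not data from the slides. The resulting t is compared against a t-distribution with N - 2 df.

```python
import math
import random
from statistics import mean, variance


def marker_t_test(y_Aa, y_aa):
    """Pooled-variance t statistic comparing marker classes Aa and aa."""
    n1, n2 = len(y_Aa), len(y_aa)
    # Pooled within-class variance s2_M, with n1 + n2 - 2 degrees of freedom.
    s2_M = ((n1 - 1) * variance(y_Aa) + (n2 - 1) * variance(y_aa)) / (n1 + n2 - 2)
    t = (mean(y_Aa) - mean(y_aa)) / math.sqrt(s2_M * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Simulated trait values (hypothetical): class Aa is shifted up by one unit.
rng = random.Random(1)
y_Aa = [rng.gauss(11.0, 1.0) for _ in range(50)]
y_aa = [rng.gauss(10.0, 1.0) for _ in range(50)]

t_M, df = marker_t_test(y_Aa, y_aa)
print(f"t_M = {t_M:.2f} with {df} df")
```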
Analysis of variance using backcross progeny

H0: µAa - µaa = 0, which holds when (a + d) = 0 or r = 0.5

Source           df          MS (Mean Square)   Expected MS
Total Genetics   N - 1       MSG                $\sigma_e^2 + b\sigma_G^2$
Marker           1           MSM                $\sigma_e^2 + b[\sigma_{G(QTL)}^2 + 4r(1-r)a^2] + bc(1-2r)^2 a^2$
G(Marker)        N - 2       MSG(M)             $\sigma_e^2 + b[\sigma_{G(QTL)}^2 + 4r(1-r)a^2]$
Residual         N(b - 1)    MSE                $\sigma_e^2$

$$F = \frac{MSM}{MSG(M)}$$

F-distribution with 1 and N - 2 df
If F is significant, then a QTL is declared to be near the marker
F = t² if df for numerator is 1

N = no. of individuals in the population
b = no. of replications
r = recombination fraction
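The link between the F statistic and the t statistic when the numerator has 1 df can be verified numerically. This sketch assumes one observation per individual (b = 1), so the within-class mean square plays the role of MSG(M). The function names and data are hypothetical.

```python
import math
from statistics import mean, variance


def one_way_f(groups):
    """F statistic of a one-way ANOVA; groups is a list of lists of trait values."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (gm - grand_mean) ** 2
                     for g, gm in zip(groups, group_means))
    ss_within = sum((v - gm) ** 2
                    for g, gm in zip(groups, group_means) for v in g)
    ms_between = ss_between / (len(groups) - 1)          # MSM, 1 df for two classes
    ms_within = ss_within / (len(values) - len(groups))  # MSG(M), N - 2 df
    return ms_between / ms_within


def pooled_t(g1, g2):
    """Pooled-variance t statistic for two groups."""
    n1, n2 = len(g1), len(g2)
    s2 = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / math.sqrt(s2 * (1 / n1 + 1 / n2))


# Tiny made-up data set with two marker classes.
y_Aa = [4.0, 5.0, 6.0]
y_aa = [1.0, 2.0, 3.0]
F = one_way_f([y_Aa, y_aa])
t = pooled_t(y_Aa, y_aa)
print(f"F = {F:.4f}, t squared = {t ** 2:.4f}")
```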
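Analysis of variance using SAS (a simple example)

data a;
  /* Marker genotypes are letters, so read them as character variables ($) */
  input Individuals Trait1 Marker1 $ Marker2 $;
  cards;
1 1.57 A B
2 1.35 B A
3 10.7 B B
…
;
/* Fit both markers as class effects and report least-squares means */
proc glm;
  class Marker1 Marker2;
  model Trait1 = Marker1 Marker2;
  lsmeans Marker1 Marker2;
run;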
Linear regression using backcross progeny

[Figure: regression of trait value y on genotypic classes aa and Aa on the x axis, showing intercept β0 and slope β1]

$$y_j = \beta_0 + \beta_1 x_j + \varepsilon_j$$

H0: µAa - µaa = 0, which holds when (a + d) = 0 or r = 0.5

Dummy variables:
aa = -1
Aa = 1

yj = trait value for the jth individual
xj = dummy variable
β0 = intercept for the regression
β1 = slope for the regression
εj = random error

Expectations:
E(β0) = 0.5 (µAa + µaa) = Mean for the trait
E(β1) = 0.5 (1 - 2r) (µAa - µaa) = (1 - 2r) g = 0.5 (a + d) (1 - 2r)

R2: percent of the phenotypic variance explained by the QTL
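A short least-squares sketch with the -1/+1 dummy coding. The genotypes and trait values are made up so that the class means are 2 and 4, and the function name is hypothetical. With balanced classes the intercept estimates the trait mean and the slope estimates half the difference between class means.

```python
from statistics import mean


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, R squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    r2 = sxy ** 2 / (sxx * syy)
    return b0, b1, r2


# Hypothetical backcross data: marker genotype and trait value per individual.
genotypes = ["aa", "Aa", "aa", "Aa", "aa", "Aa"]
trait = [2.1, 3.9, 1.8, 4.2, 2.1, 3.9]

code = {"aa": -1, "Aa": 1}  # dummy coding used on this slide
x = [code[g] for g in genotypes]

b0, b1, r2 = simple_regression(x, trait)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, R2 = {r2:.3f}")
```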
Linear regression using backcross progeny
Interpretation of results depends on coding of the dummy variables

[Figure: two panels plotting y (0 to 6) against x (-2 to 2), with genotypic classes aa and Aa marked on the x axis]
Panel 1: y = 3 + x + e; µ = 3, µAa = 4, µaa = 2, g = 0.5(µAa - µaa) = 1
Panel 2: y = 3 - x + e; µ = 3, µAa = 2, µaa = 4, g = 0.5(µAa - µaa) = -1
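To see how the coding changes the fitted coefficients, the same made-up data can be fitted under three codings. The 0/1 coding is an extra illustration not shown on the slide, and all names here are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b1 * x_bar, b1


# Hypothetical data with class means 2 (aa) and 4 (Aa).
genotypes = ["aa", "Aa", "aa", "Aa", "aa", "Aa"]
trait = [2.1, 3.9, 1.8, 4.2, 2.1, 3.9]

codings = {
    "aa=-1, Aa=+1": {"aa": -1, "Aa": 1},
    "aa=+1, Aa=-1": {"aa": 1, "Aa": -1},
    "aa=0, Aa=1": {"aa": 0, "Aa": 1},
}

results = {}
for label, code in codings.items():
    x = [code[g] for g in genotypes]
    results[label] = fit_line(x, trait)
    b0, b1 = results[label]
    print(f"{label}: intercept = {b0:.2f}, slope = {b1:.2f}")
```

Swapping the signs of the codes flips the sign of the slope. With 0/1 coding the intercept becomes the aa class mean and the slope becomes the full difference between class means, so the coding must be known before a slope is read as a genetic effect.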
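A likelihood approach using backcross progeny

Joint distribution function:
$$L = \left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^{N} \prod_{i=1}^{N} \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right]$$

p(Qj | Mi) = probability of QTL genotype j given the marker genotype of individual i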
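A likelihood approach using backcross progeny (cont.)

Full model:
$$\ln L(\mu_1, \mu_2, \sigma^2, r) = \sum_{i=1}^{N} \ln\left\{ \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] \right\} - \frac{N}{2} \ln(2\pi\sigma^2)$$

No QTL effect (µ1 = µ2 = µ, the common mean):
$$\ln L(\mu_1 = \mu_2, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2 - \frac{N}{2} \ln(2\pi\sigma^2)$$

No linkage (r = 0.5):
$$\ln L(r = 0.5) = \sum_{i=1}^{N} \ln\left\{ 0.5 \left( \exp\left[ -\frac{(y_i - \mu_1)^2}{2\sigma^2} \right] + \exp\left[ -\frac{(y_i - \mu_2)^2}{2\sigma^2} \right] \right) \right\} - \frac{N}{2} \ln(2\pi\sigma^2)$$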
A likelihood approach using backcross progeny (cont.)

H0: µAa - µaa = 0, which holds when (a + d) = 0 or r = 0.5

$$G = 2\left[ \ln L(\hat{\mu}_{Aa}, \hat{\mu}_{aa}, \hat{\sigma}^2, \hat{r}) - \ln L(r = 0.5) \right]$$
G is distributed asymptotically as a chi-square variable with one degree of freedom

$$G = 2\left[ \ln L(\hat{\mu}_{Aa}, \hat{\mu}_{aa}, \hat{\sigma}^2, \hat{r}) - \ln L(\mu_{Aa} = \mu_{aa}) \right]$$
The t-test is approximately equivalent to the likelihood ratio test using this formula

G-statistics
Likelihood ratio test statistics (LR)
Probability of occurrence of the data under the null hypothesis
(Weller, 1986)
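A rough sketch of the likelihood ratio test against the null of equal means, on simulated backcross data. Several things here are assumptions rather than the slides' method: r is held fixed at a known value instead of being estimated, the maximization uses a plain EM iteration (a technique the slides do not name), and all function names and parameter values are made up.

```python
import math
import random


def mixture_terms(yi, mi, mu1, mu2, sigma2, r):
    """Weighted normal kernels of the two QTL genotypes for one individual."""
    p1 = 1 - r if mi == 0 else r  # p(Q1 | M): marker class 0 mostly carries Q1
    k1 = p1 * math.exp(-(yi - mu1) ** 2 / (2 * sigma2))
    k2 = (1 - p1) * math.exp(-(yi - mu2) ** 2 / (2 * sigma2))
    return k1, k2


def log_likelihood(y, m, mu1, mu2, sigma2, r):
    """Ln L(mu1, mu2, sigma2, r) for trait values y and marker classes m (0 or 1)."""
    total = -0.5 * len(y) * math.log(2 * math.pi * sigma2)
    for yi, mi in zip(y, m):
        total += math.log(sum(mixture_terms(yi, mi, mu1, mu2, sigma2, r)))
    return total


def em_fit(y, m, r, n_iter=200):
    """EM estimates of mu1, mu2, sigma2 with r held fixed (an assumption)."""
    n = len(y)
    # Start from the marker-class means and the within-class variance.
    mu1 = sum(yi for yi, mi in zip(y, m) if mi == 0) / m.count(0)
    mu2 = sum(yi for yi, mi in zip(y, m) if mi == 1) / m.count(1)
    sigma2 = sum((yi - (mu1 if mi == 0 else mu2)) ** 2 for yi, mi in zip(y, m)) / n
    for _ in range(n_iter):
        # E-step: posterior probability that each individual carries Q1.
        w = []
        for yi, mi in zip(y, m):
            k1, k2 = mixture_terms(yi, mi, mu1, mu2, sigma2, r)
            w.append(k1 / (k1 + k2))
        # M-step: weighted means and a common variance.
        sw = sum(w)
        mu1 = sum(wi * yi for wi, yi in zip(w, y)) / sw
        mu2 = sum((1 - wi) * yi for wi, yi in zip(w, y)) / (n - sw)
        sigma2 = sum(wi * (yi - mu1) ** 2 + (1 - wi) * (yi - mu2) ** 2
                     for wi, yi in zip(w, y)) / n
    return mu1, mu2, sigma2


# Simulated backcross (hypothetical values): QTL means 12 and 10, sigma = 1.
rng = random.Random(0)
r_true, n = 0.1, 200
m = [i % 2 for i in range(n)]
y = []
for mi in m:
    q = mi if rng.random() >= r_true else 1 - mi  # recombination flips the QTL genotype
    y.append(rng.gauss(12.0 if q == 0 else 10.0, 1.0))

mu1_hat, mu2_hat, sigma2_hat = em_fit(y, m, r=r_true)
lnL_alt = log_likelihood(y, m, mu1_hat, mu2_hat, sigma2_hat, r_true)

# Null model: one common mean, so the mixture collapses to a single normal.
mu0 = sum(y) / n
sigma2_0 = sum((yi - mu0) ** 2 for yi in y) / n
lnL_null = log_likelihood(y, m, mu0, mu0, sigma2_0, 0.5)

G = 2 * (lnL_alt - lnL_null)
print(f"mu1 = {mu1_hat:.2f}, mu2 = {mu2_hat:.2f}, G = {G:.1f}")
```

With a single marker, r and the QTL means cannot be estimated separately from the marker-class means alone, which is one reason the sketch fixes r.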
G-statistics and LOD score
LOD: logarithm of the odds ratio
Base 10 logarithm of the likelihood ratio, whereas G (LR) uses the natural logarithm
LR = 2 ln(10) LOD = 4.605 LOD
LOD = 0.217 LR
LOD is interpreted as the base 10 logarithm of an odds ratio
(probability of observing the data under linkage / probability of observing the same data under no linkage)
No theoretical distribution is needed to interpret a LOD score
Key value: ≥ 3 (H1 is 1000 times more likely than H0, no linkage)
(approx: p = 0.001)
p = probability of type I error
Type I error: false positive (declare a QTL when there is no QTL)
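The conversion between LR and LOD is a change of logarithm base, sketched here with hypothetical helper names:

```python
import math

TWO_LN_10 = 2 * math.log(10)  # about 4.605


def lod_to_lr(lod):
    """Convert a LOD score to the likelihood ratio statistic (LR, or G)."""
    return TWO_LN_10 * lod


def lr_to_lod(lr):
    """Convert a likelihood ratio statistic (LR, or G) to a LOD score."""
    return lr / TWO_LN_10


lod = 3.0
lr = lod_to_lr(lod)
odds = 10 ** lod  # likelihood under H1 relative to H0
print(f"LOD = {lod}: LR = {lr:.3f}, odds = {odds:.0f} to 1")
```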
Single-marker analysis: Summary
• Identify marker-trait associations
• Identify missing or incorrectly formatted data
• Genetic map is not required
• Divide the population into subpopulations based on the allelic segregation of individual loci (one marker at a time)
• Get trait means for each subpopulation (genotypic class)
• Determine if the subpopulations' trait means are significantly different
• Limitations
  – Underestimates QTL number and effects
  – QTL position cannot be precisely determined