Upload
arty
View
36
Download
3
Embed Size (px)
DESCRIPTION
A Transmission/disequilibrium Test for Ordinal Traits in Nuclear Families and a Unified Approach for Association Studies. Heping Zhang, Xueqin Wang and Yuanqing Ye Department of Epidemiology and Public Health Yale University Presented at Workshop on Genomics, NUS November 14, 2005. Outline. - PowerPoint PPT Presentation
Citation preview
A Transmission/disequilibrium Test for Ordinal Traits in Nuclear Families and a Unified Approach for Association Studies
A Transmission/disequilibrium Test for Ordinal Traits in Nuclear Families and a Unified Approach for Association Studies
Heping Zhang, Xueqin Wang and Yuanqing Ye
Department of Epidemiology and Public Health
Yale University
Presented at Workshop on Genomics, NUS
November 14, 2005
November 14, 2005 2
OutlineOutline
• Data structure
• TDT, Q-TDT, S-TDT, etc.
• O-TDT for ordinal traits
• Simulations
• Data analysis
• Discussion and conclusion
November 14, 2005 3
Data StructureData Structure
… …
n families
siblings insiblings 1n siblings nn
November 14, 2005 4
Linkage Analysis – Null HypothesisLinkage Analysis – Null Hypothesis
To test for linkage, the null hypothesis is that the marker locus is not linked to any trait locus.
Marker
Trait locus
Linked Unlinked
November 14, 2005 5
Linkage Analysis - Recombination FractionLinkage Analysis - Recombination Fraction
Marker
Trait locus
2
1 2
1
fractionion recombinat
thebe Let 2
2
November 14, 2005 6
Coefficient of Linkage DisequilibriumCoefficient of Linkage Disequilibrium
Marker
Trait locus
d
d1
m
m1
Allele Frequency Haplotype Frequency?
)1( md
md )1(
)1)(1( md
dm
Freq( , ) - Freq( )Freq( )=
November 14, 2005 7
Null Hypothesis – Linkage DisequilibriumNull Hypothesis – Linkage Disequilibrium
0)2-(1
TDT is to test for linkage in presence of association
or test for association in presence of linkage
(Spielman et al. 1993; Ewens and Spielman 1995).
The null hypothesis of haplotype relative risk (Falk
and Rubinstein, 1987) being 1 is:
November 14, 2005 8
Transmission/Disequilibrium Test (TDT)Transmission/Disequilibrium Test (TDT)
• Eliminate the confounding effects caused by
population stratification/admixture, and other
factors
• A McNemar’s test
November 14, 2005 9
TDT-McNemar TestTDT-McNemar TestSuppose two heterozygous parents and an affected child are genotyped.
A Aa a
Father Mother
marker
Trans Nontrans Father Mother
Aax
AaxAax
Aax
November 14, 2005 10
TDT-McNemar TestTDT-McNemar TestSuppose two heterozygous parents and an affected child are genotyped.
A Aa a
Father Mother
marker
Trans Nontrans FatherAax Nontrans
A aTrans A a
0 10 0
November 14, 2005 11
TDT-McNemar TestTDT-McNemar TestSuppose two heterozygous parents and an affected child are genotyped.
A Aa a
Father Mother
marker
Trans Nontrans Mother
AaxNontransA aTrans
A a
0 01 0
November 14, 2005 12
TDT-McNemar TestTDT-McNemar TestSuppose two heterozygous parents and an affected child are genotyped.
A Aa a
Father Mother
marker
Trans Nontrans Trans Nontrans
NontransA aTrans
A a
0 11 0
AaxAax
November 14, 2005 13
TDT-McNemar TestTDT-McNemar Test
Nontransmitted
Transmitted A a Total
A
a
Total
11n 12n
21n 22n
2111 nn 2212 nn
1211 nn
2221 nn
)/()( 21122
21122 nnnnc
n2
Combinations of Transmitted and Nontransmitted Marker Alleles A and a among 2n Parents of n Affected Children
November 14, 2005 14
Further DevelopmentsFurther Developments
• Q-TDT proposed by Allison (1997)
• Q-TDT further investigated by Rabinowitz
(1997)
• S-TDT (Spielman and Ewens 1998)
• FBAT (Lunetta et al. 2000; Rabinowitz and
Laird 2000)
• Many other extensions
November 14, 2005 15
General Test StatisticGeneral Test Statistic
Assume that there are n nuclear families. In the
family, there are siblings, i=1,…, n. For the
child in the family, the trait value is and the
genotype is . is the number of allele A in the
genotype . The linkage/association test statistic
can be constructed as follows:
where is a weight of the phenotype .
thi
in
ijythi
thj
ijgijX
ijg
n
i
n
jijij
n
ii
i
XWTT1 11
,
ijW ijy
November 14, 2005 16
ExampleExample
• For a sample of affected child-parent triads, let
then is the TDT
introduced by Spielman et al. (1993).
• For a sample of nuclear families with quantitative
trait values , let , where is the average
of trait values, then is the
Q-TDT introduced by Rabinowitz (1997)
• For ordinal trait?
,1ijW
ijy yyW ijij y
)|(/))|(( 2 YTVarYTET
)|(/))|(( 2 YTVarYTET
November 14, 2005 17
TDT for Ordinal TraitsTDT for Ordinal Traits
Let be the count of children whose trait values greater
or less than y and , the test statistic for
ordinal traits is
Under the null hypothesis, follows
.
yR
ijij yyij RRW
n
i
n
jijyy
i
ijijXRRT
1 1
)(
)(/))(( 2 TVarTET 21
November 14, 2005 18
Model and MethodModel and Method
,1,...,1 ),())|((logit :Model KkgIgkYP k
. genotypein allele of copies ofnumber theis )(
effect. genetic is and ,parameters level are ' where
gDgI
sk
• Di-allelic maker with possible alleles A and a.
• Assume that there is a trait increasing allele , and
we use to denote the wild type allele(s)
•Consider a trait taking values in ordinal responses
1,…, K.
November 14, 2005 19
Two Common AssumptionsTwo Common Assumptions
• The trait and marker loci are closely linked
such that, given the family’s genotypes at a trait
locus , the family’s phenotypes and marker
genotypes are independent;
• Given disease genotypes, the traits of the
family members are conditionally independent.
November 14, 2005 20
Conditional LikelihoodConditional Likelihood
}.|{}{
}|{}{}|{
:familyth in the phenotypes observed given the
genotypes ssibling' for the likelihood lconditiona The
ig i
iiii MgP
yP
gyPMPyMP
i
}){2()]1()(1[}){1}({
|})|{log( 0
APXyyAPAP
yMP
ijj
ijij
ii
The score function
November 14, 2005 21
Score StatisticScore Statistic
.)(1 1
n
i
n
jijyy
i
ijijXRRT
After plugging in the estimates for the nuisance parameters, the score function under the null hypothesis is , where)|( YTET
November 14, 2005 22
Expectation and VarianceExpectation and Variance
Following the idea of Rabinowitz and laird (2000),
we can compute or estimate the conditional
expectation and the conditional variance given
the observed trait values under null hypothesis in
the following three cases:
(a) both parental marker information is available;
(b) only one of parental marker information is
available; and
(c) none of parental marker information is available.
November 14, 2005 23
Expectation and VarianceExpectation and Variance
)},()]()()][()([2
}{)]()([{}|{
}{)]()([}|{
2
,
ikijjk
ikikijij
i jijijij
ijji
ijij
XXCovyRyRyRyR
XVaryRyRYTVar
XEyRyRYTE
November 14, 2005 24
Both Parents GenotypedBoth Parents GenotypedWhen both parents’ genotypes are observed, the children’s genotypes are conditionally independent.
Parental Genotypes Expectation Variance
(AA, AA) 2 0
(AA, Aa) 3/2 1/4
(AA, aa) 1 0
(Aa, Aa) 1 1/2
(Aa, aa) 1/2 1/4
(aa, aa) 0 0
November 14, 2005 25
One Parent GenotypedOne Parent GenotypedParental
Genotype
Children’s Possible
Genotypes
Cond. Probability Joint Conditional Genotype Distribution of
Two SibsAA Aa aa
AA {AA} 1 P{AA, AA}=1
{Aa} 1 P{Aa, Aa}=1
{AA, Aa} 1/2 1/2 P{AA, Aa}=
P{AA, AA}=P(Aa, Aa}=
aa {Aa} 1 P{Aa, Aa}=1
{aa} 1 P{aa, aa}=1
{Aa, aa} P{AA, Aa}=
P{Aa, Aa}=P{aa, aa}=
Aa {AA} 1 P{AA, AA}=1
{Aa} 1 P{Aa, Aa}=1
{aa} 1 P{aa, aa}=1
{AA, Aa} P{AA, AA}=
P(Aa, Aa}=
P{AA, Aa}=
)22/(2 2 nn
)22/()12( 2 nn
)22/(2 2 nn
)22/()12( 2 nn
nnAA / nnAa / )1(/)1( nnnn AAAA
)1(/)1( nnnn AaAa)1(/ nnnn AaAA
November 14, 2005 26
One Parent Genotyped (continued)One Parent Genotyped (continued)Parental
Genotype
Children’s
Possible
Genotypes
Cond. Probability Joint Conditional Genotype
Distribution of Two SibsAA Aa aa
Aa {Aa, aa} P{Aa, Aa}=
P(aa, aa}=
P{Aa, aa}=
{AA,aa} P{AA, AA}=P{aa, aa}=
P{AA, Aa}/2= P{Aa, aa}/2
=
P{AA, aa}=
P{Aa, Aa}=
{AA, Aa,
aa}
nnaa /nnAa /
nnn
nn
2324
34 11
nnn
nnn
2324
23442 11
nnn
nn
2324
34 11
)2324/()34( 22 nnnnn
nnn
n
2324
4 2
nnn
nnn
2324
2384 21
ly.respective aa, and Aa, AA,
genotypeith children w ofnumber theis and ,, aaAaAA nnn
)1(/ nnnn aaAa
)1(/)1( nnnn aaaa
)1(/)1( nnnn AaAa
November 14, 2005 27
No Parental GenotypeNo Parental GenotypeChildren’s
Possible
Genotypes
Cond. Probability Joint Conditional Genotype Distribution
of Two SibsAA Aa aa
{AA} 1 P{AA, AA}=1
{Aa} 1 P{Aa, Aa}=1
{aa} 1 P{aa, aa}=1
{AA, Aa} P{AA, AA}=
P(Aa, Aa}=
P{AA, Aa}=
{Aa, aa} P{Aa, Aa}=
P(aa, aa}=
P{Aa, aa}=
{AA,aa} P{AA,AA}=P{aa,aa}=P{AA,Aa}/2=
P{Aa, aa}/2=
P{AA, aa}=
P{Aa, Aa}=
{AA, Aa, aa}
nnAA / nnAa /
)1(/ nnnn AaAA
nnAa / nnaa /)1(/ nnnn aaAa
nnn
nn
2324
34 11
nnn
nnn
2324
23442 11
)2324/()34( 11 nnnnn )2324/()34( 22 nnnnn
nnn
n
2324
4 2
)2324/(2384( )21 nnnnnn
)1(/)1( nnnn AAAA
)1(/)1( nnnn AaAa
)1(/)1( nnnn AaAa
)1(/)1( nnnn aaaa
November 14, 2005 28
Simulation StudiesSimulation Studies
• Assess the type I error of our score test with respect
to specific nominal levels (0.05, 0.01, and 0.0001) to
validate the asymptotic behavior of the test statistic.
• Compare the power of our test with other test
statistics.
• Choose the ordinal level K=3, 4, or 5.
November 14, 2005 29
Simulation DesignSimulation Design
• Generate the parent’s genotypes for given the haplotype frequencies
11.0
Haplotype Frequency
AD 0.2
Ad 0.1
aD 0.1
ad 0.6
November 14, 2005 30
Simulation DesignSimulation Design
• Given the parental genotypes, generate the offspring genotypes assuming unlinked (null) or linked (1cM, alternative) trait and marker loci
• Conditional on the trait genotype, use the proportional odds model to generate the ordinal trait.
• 200 or 400 families are generated
November 14, 2005 31
Three models to generated trait valuesThree models to generated trait values
(a) A proportional odds model is used to
generate an ordinal trait;
(b) A non-proportional odds model is also used
to generate an ordinal trait to assess the
robustness of our score test with respect to
the proportionality assumption;
(c) A Gaussian model is used to generate a
quantitative trait to evaluate the performance
of O-TDT for the quantitative trait.
November 14, 2005 32
1,...,1for
,1 and 2, and 1, 0, are of valuesPossible
alleles.
increasing trait of copies ofnumber theis where
,1,...,1,)]|([logit
model odds
alproportion by the generated are traitsordinal The
Kk
kt
t
KkttkYP
k
k
Ordinal Traits Generated from a Proportional Odds Model (a)Ordinal Traits Generated from a Proportional Odds Model (a)
November 14, 2005 33
05.0 01.0
Type I Errors Based on 10,000 Replications (a)Type I Errors Based on 10,000 Replications (a)
#of
families
K
Q-TDT O-TDT TDT Q-TDT O-TDT TDT Q-TDT O-TDT TDT
200 3 0.0509
0.0506 0.0508 0.0100 0.0099 0.0095 6e-005 8e-005 9e-005
4 0.0506 0.0501 0.0508 0.0099 0.0101 0.0091 7e-005 6e-005 9e-005
5 0.0509 0.0513 0.0512 0.0102 0.0099 0.0098 6e-005 0.0001 7e-005
400 3 0.0519 0.0518 0.0507 0.0106 0.0107 0.0098 0.0001 0.0001 8e-005
4 0.0508 0.0500 0.0500 0.0105 0.0103 0.0098 0.0001 9e-005 0.0001
5 0.0520 0.0507 0.0502 0.0107 0.0105 0.0098 9e-005 0.0001 7e-005
0001.0
November 14, 2005 34
Figure: Power comparison (a)Figure: Power comparison (a)
November 14, 2005 35
K=3 P(Y1|dd)=.7
P(Y2|dd)=.9
P(Y1|dD)=.3
P(Y2|dD)=.6
P(Y1|DD)=.1
P(Y2|DD)=.5
P(Y=1)=0.478
P(Y=2)=0.260
P(Y=3)=0.262
K=4 P(Y1|dd)=.7
P(Y2|dd)=.8
P(Y3|dd)=.9
P(Y1|dD)=.3
P(Y2 |dD)=.5
P(Y3 |dD)=.7
P(Y1|DD)=.1
P(Y2 |DD)=.35
P(Y3 |DD)=.6
P(Y=1)=0.478
P(Y=2)=0.155
P(Y=3)=0.156
P(Y=4)=0.211
K=5 P(Y1|dd)=.7
P(Y2|dd)=.77
P(Y3|dd)=.85
P(Y4|dd)=.92
P(Y1 |dD)=.2
P(Y2 |dD)=.45
P(Y3 |dD)=.65
P(Y4 |dD)=.8
P(Y1 |DD)=.05
P(Y2 |DD)=.35
P(Y3 |DD)=.55
P(Y4 |DD)=.75
P(Y=1)=0.431
P(Y=2)=0.166
P(Y=3)=0.141
P(Y=4)=0.115
P(Y=5)=0.146
Non-Proportional Odds Model (b)Non-Proportional Odds Model (b)
Conditional and marginal distribution for ordinal trait
November 14, 2005 36
05.0 01.0
Type I Errors Based on 10,000 Replications (b)Type I Errors Based on 10,000 Replications (b)
#of
families
K
Q-TDT O-TDT TDT Q-TDT O-TDT TDT Q-TDT O-TDT TDT
200 3 0.0492 0.0493 0.0495 0.0098 0.0097 0.0095 7e-005 7e-005 7e-005
4 0.0497
0.0499 0.0495 0.0096 0.0097 0.0095 4e-005 7e-005 7e-005
5 0.0497 0.0504 0.0485 0.0105 0.0106 0.0093 5e-005 7e-005 5e-005
400 3 0.0503 0.0501 0.0481 0.0097 0.0097 0.0093 0.0001 0.0001 0.0001
4 0.0509 0.0504 0.0481 0.0096 0.0099 0.0093 8e-005 9e-005 0.0001
5 0.0514 0.0508 0.0475 0.0096 0.0094 0.0090 0.0001 9e-005 0.0001
0001.0
November 14, 2005 37
Figure: Power comparison (b)Figure: Power comparison (b)
November 14, 2005 38
Performance for Quantitative Traits (c)Performance for Quantitative Traits (c)
Our test can serve as a unified test for any trait. For
quantitative trait, the weights in our test are the
functions of quantiles. Simulations show that our
test is competitive with, but slightly less powerful
than Q-TDT.
November 14, 2005 39
Type I Errors for Quantitative Traits Based on 100,000 Replications (c)Type I Errors for Quantitative Traits Based on 100,000 Replications (c)
# of
FamilyQ-TDT O-TDT Q-TDT O-TDT Q-TDT O-TDT
200 0.05407 0.05268 0.01041 0.01015 0.00013 8e-005
400 0.0538 0.05206 0.01099 0.01040 0.00013 0.00012
05.0 01.0 01.0
November 14, 2005 40
Power: Quantitative Trait Power: Quantitative Trait
2.= and 1,= 0,= DDDddd
Data are simulated similarly to the experiments for assessing type I error, except the following. Given the genotype at the trait locus, the quantitative trait follows the normal distribution with mean proportional to the number of the trait increasing allele and unit variance. Namely,
November 14, 2005 41
Figure: Power comparison (c)Figure: Power comparison (c)
November 14, 2005 42
Data (Dr. Ming Li) Data (Dr. Ming Li)
• Identify candidate SNPs through association
analysis
• Nicotine dependence was measured in 313
families with 1,396 subjects. 12 SNPs were
genotyped for GPR51 gene (suggested from
Framingham Heart Study samples).
• One ordinal trait with 8 levels was assessed by
Fagerstrom test for nicotine dependence (FTND)
• FBAT was also used for comparison
November 14, 2005 43
FTNDFTND
1. How many cigarettes a day do you usually smoke? (0-3 points)
2. How soon after you wake up do you smoke your first cigarette? (0-3 points)
3. Do you smoke more during the first two hours of the day than during the rest of the day? (0,1)
4. Which cigarette would you most hate to give up? (0,1)
5. Do you find it difficult to refrain from smoking in places where it is forbidden, such as public buildings, on airplanes or at work? (0,1)
6. Do you still smoke even when you are so ill that you are in bed most of the day? (0,1)
TOTAL POINTS =
November 14, 2005 44
GPR51 GeneGPR51 Gene
• G protein-coupled receptor 51 (on 5q24 on rat genome and 9p22.33 on human genome)• Combines with GABA-B1 to form functional GABA-B receptors• Inhibits high voltage activated calcium ion channels
November 14, 2005 45
ResultsResults
SNP ID PBAT-GEE O-TDT Q-TDT
Pooled AA EA Pooled AA EA Pooled AA EA
rs2304389 .03 .81 .14 .85 .84 .96 .89 .97 .86
rs1435252 .13 .09 .004 .50 .67 .037 .68 .53 .059
rs3780422 .21 .49 .04 .26 .29 .67 .29 .29 .72
rs1537959 .07 .02 .38 .48 .24 .65 .72 .43 .60
rs2491397 .07 .11 .12 .23 .82 .044 .36 .98 .062
rs2779562 .38 .44 .008 .76 .21 .012 .49 .32 .0048
rs3750344 .35 .74 .006 .13 .49 .026 .17 .69 .013
November 14, 2005 46
Discussion and ConclusionDiscussion and Conclusion
• We propose a score test statistic for Linkage analysis.• Although it is derived from a proportional odds model for ordinal traits, power comparisons reveal that it can serve as a unified approach for dichotomous, quantitative, and ordinal traits.• The score based Q-TDT test yields lower power than O-TDT for ordinal traits, but the difference ranges from a few to tens of percents, depending on the distribution of the ordinal traits.