of 51 /51
1 Univariate Twin Analysis OpenMx Tc26 2012 Hermine Maes, Elizabeth Prom-Wormley, Nick Martin

# Univariate Twin Analysis

sammy

• View
46

0

Embed Size (px)

DESCRIPTION

Univariate Twin Analysis. OpenMx Tc26 2012 Hermine Maes, Elizabeth Prom-Wormley, Nick Martin. Overall Questions to be Answered. Does a trait of interest cluster among related individuals? Can clustering be explained by genetic or environmental effects? - PowerPoint PPT Presentation

Citation preview

1

Univariate Twin Analysis

OpenMx Tc26 2012Hermine Maes, Elizabeth Prom-Wormley, Nick Martin

2

•Does a trait of interest cluster among related individuals?

•Can clustering be explained by genetic or environmental effects?

•What is the best way to explain the degree to which genetic and environmental effects affect a trait?

3

Family & Twin Study Designs

•Family Studies

•Classical Twin Studies

•Extended Twin Studies

4

The Data

•Australian Twin Register

•18-30 years old, males and females

•Work from this session will focus on Body Mass Index (weight/height2) in females only

•Sample size

•MZF = 534 complete pairs (zyg = 1)

•DZF = 328 complete pairs (zyg = 3)

5

A Quick Look at the Data

6

Classical Twin Studies Basic Background

•The Classical Twin Study (CTS) uses MZ and DZ twins reared together

•MZ twins share 100% of their genes

•DZ twins share on average 50% of their genes

•Expectation- Genetic factors are assumed to contribute to a phenotype when MZ twins are more similar than DZ twins

7

Classical Twin Study

Assumptions

•MZ twins are genetically identical

•Equal Environments of MZ and DZ pairs

8

Basic Data Assumptions

•MZ and DZ twins are sampled from the same population, therefore we expect :-

•Equal means/variances in Twin 1 and Twin 2

•Equal means/variances in MZ and DZ twins

•Further assumptions would need to be tested if we introduce male twins and opposite sex twin pairs

9

“Old Fashioned” Data Checking

MZ DZT1 T2 T1 T2

mean21.3

521.3

421.4

521.4

6variance 0.73 0.79 0.77 0.82covariance(T1-

T2)0.59 0.25

Nice, but how can we actually be sure that these means and variances are truly the same?

10

Univariate AnalysisA Roadmap1- Use the data to test basic assumptions

(equal means & variances for twin 1/twin 2 and MZ/DZ pairs)

Saturated Model

2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype

3- Test ACE (ADE) submodels to identify and report significant genetic and environmental contributions

AE or CE or E Only Models10

11

Saturated Twin Model

12

Saturated Code Deconstructed

mMZ1 mMZ2 mDZ1 mDZ2mean MZ = 1 x 2

matrixmean DZ = 1 x 2

matrix

13

Saturated Code Deconstructed

vMZ1 cMZ21

cMZ21 vMZ2

T1 T2

T1

T2

vDZ1 cDZ21

cDZ21 vDZ2

T1 T2

T1

T2

covMZ = 2 x 2 matrix

covDZ = 2 x 2 matrix

14

Time to Play... with OpenMx

Estimated ValuesT1 T2 T1 T2Saturated Model

mean MZ DZcov T1 T1

T2 T210 Total Parameters Estimated

Standardize covariance matrices for twin pair correlations (covMZ & covDZ)

mMZ1, mMZ2, vMZ1,vMZ2,cMZ21mDZ1, mDZ2, vDZ1,vDZ2,cDZ21

16

Estimated Values

10 Total Parameters Estimated

Standardize covariance matrices for twin pair correlations (covMZ & covDZ)

mMZ1, mMZ2, vMZ1,vMZ2,cMZ21mDZ1, mDZ2, vDZ1,vDZ2,cDZ21

T1 T2 T1 T2Saturated Model

mean MZ 21.34 21.35 DZ 21.45 21.46cov T1 0.73 T1 0.77

T2 0.59 0.79 T2 0.24 0.82

17

Fitting Nested Models

•Saturated Model• likelihood of data without any constraints

• fitting as many means and (co)variances as possible

•Equality of means & variances by twin order• test if mean of twin 1 = mean of twin 2

• test if variance of twin 1 = variance of twin 2

•Equality of means & variances by zygosity• test if mean of MZ = mean of DZ

• test if variance of MZ = variance of DZ

Estimated Values

T1 T2 T1 T2Equate Means & Variances across Twin

Ordermean MZ DZcov T1 T1

T2 T2Equate Means Variances across Twin Order

& Zygositymean MZ DZcov T1 T1

T2 T2

19

Estimates

T1 T2 T1 T2Equate Means & Variances across Twin

Ordermean MZ 21.35 21.35 DZ 21.45 21.45cov T1 0.76 T1 0.79

T2 0.59 0.76 T2 0.24 0.79Equate Means Variances across Twin Order

& Zygositymean MZ 21.39 21.39 DZ 21.39 21.39cov T1 0.78 T1 0.78

T2 0.61 0.78 T2 0.23 0.78

20

StatsModel ep -2ll df AIC

diff -2ll

diffdf

p

Saturated 104055.9

3176

7521.9

3

mT1=mT2 8 4056176

9518 0.07 2

0.97

mT1=mT2 &varT1=VarT2 6

4058.94

1771

516.94

3.01 40.56

ZygMZ=DZ 4

4063.45

1773

517.45

7.52 60.28

No significant differences between saturated model and models where means/variances/covariances are equal by zygosity and between twins

21

Questions?

This is all great, but we still don’t understand the genetic and

environmental contributions to BMI

22

Patterns of Twin Correlation

DZ twins on average

effects

rMZ = rDZShared

Environment

rMZ > 2rDZ

e

DZ twins on

average share 25%

of dominance effects

???

A = 2(rMZ-rDZ)C = 2rDZ - rMZ E = 1- rMZ

23

BMI Twin Correlations

0.30

0.78

A = 2(rMZ-rDZ)

C = 2rDZ - rMZ

E = 1- rMZA = 0.96

C = -0.20

E = 0.22

24

1- Use the data to test basic assumptions inherent to standard ACE (ADE) models

Saturated Model

2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype

3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions

AE or E Only Models

25

•Dominance refers to non-additive genetic effects resulting from interactions between alleles at the same locus

•ADE models also include effects of interactions between different loci (epistasis)

•In addition to sharing on average 50% of their DNA, DZ twins also share about 25% of effects due to dominance

26

a1 x 1 matrix

d1 x 1 matrix

e1 x 1 matrix

covA <- mxAlgebra( expression=a %*% t(a), name="A" )

covD <- mxAlgebra( expression=d %*% t(d), name=“D" )

covE <- mxAlgebra( expression=e %*% t(e), name="E" )

27

a1 x 1 matrix

d1 x 1 matrix

e1 x 1 matrix

aT

eT

*

*

*

dT

28

expMean

expMean

expMean= 1 x 2 matrix

A+D+E

A+D+E

T1T2

T2

T1

covMZ & covDZ= 2 x 2 matrix

29

A+D+E

0.5A A+D+E

T1T2

T2

T1

A+D+E

A A+D+E

T1T2

T2

T1

30

A+D+E0.5A+0.25

D

0.5A+0.25D

A+D+E

T1T2

T2

T1

A+D+E A+D

A+D A+D+E

T1T2

T2

T1

31

4 Parameters Estimated

ExpMeanVariance due to AVariance due to DVariance due to E

32

1- Use the data to test basic assumptions inherent to standard ACE (ADE) models

Saturated Model

2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype

3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions

AE or E Only Models

33

Testing Specific Questions

•Comparison against Saturated•Nested Models•AE Model•test significance of D

•E Model vs AE Model•test significance of A

•E Model vs ADE Model•test combined significance of A

& D

34

AE Model Deconstructed

AeModel <- mxModel( AdeFit, name="AE" )

Object containing Full Model

New Object ContainingAE Model

AeModel <- omxSetParameters( AeModel, labels=“d11", free=FALSE, values=0)

AeFit <- mxRun(AeModel)

The d path will not be estimated

Fixing the value of the c path to zero

35

Goodness of Fit Stats

ep -2ll df AICdiff -2ll

diffdf

p

Saturated

AE

DE no! no! no! no! no! no! no!

E

Goodness-of-Fit Statistics

ep -2ll df AICdiff -2ll

diffdf

p

Saturated

104055.9

3176

7521.9

3- - -

5177

3517.4

57.52 6

0.28

AE 34067.6

6177

4519.6

64.21 1

0.06

E 24591.7

9177

51041.

79528.

32

0.00

37

What about the magnitudes of genetic and environmental

contributions to BMI?

38

1- Use the data to test basic assumptions inherent to standard ACE (ADE) models

Saturated Model

2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype

3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions

AE or E Only Models

39

Estimated Values

a d e c a2 d2 e2 c2

AE -

E -

Estimated Values

a d e c a2 d2 e2

ADE 0.57 0.54 0.41 - 0.41 0.37 0.22

AE 0.78 - 0.42 - 0.78 - 0.22

E - - 0.88 - - - 1.00

41

Okay, so we have a sense of which model

best explains the data...or do we?

42

BMI Twin Correlations

0.30

0.78

A = 2(rMZ-rDZ)

C = 2rDZ - rMZ

E = 1- rMZ

43

Giving the ACE Model a Try

(if there is time)

44

Goodness-of-Fit Statistics

Model ep -2ll df AICdiff -2ll

diffdf

p

Saturated

ACE

45

Goodness-of-Fit Statistics

Model ep -2ll df AICdiff -2ll

diffdf

p

Saturated 104055.9

3176

7521.93 - - -

5177

3517.45 7.52 6

0.28

ACE 44067.6

6177

3521.66

11.73

60.07

46

Estimated Values

a d e c a2 d2 e2 c2

AE

ACE

AE

E

Estimated Values

a d e c a2 d2 e2 c2

ADE 0.57 0.54 0.41 - 0.41 0.37 0.22 -

AE 0.78 - 0.42 - 0.78 - 0.22 -

ACE 0.78 0.00 0.42 - 0.78 0.00 0.22 -

AE - - 0.56 0.68 - - 0.41 0.59

E - - 0.88 - - - 1.00 -

48

Conclusions

•BMI in young OZ females (age 18-30)

•dominance: borderline significant

•specific environmental factors: significant

•shared environmental factors: not

49

Potential Complications

•Assortative Mating

•Gene-Environment Correlation

•Gene-Environment Interaction

•Sex Limitation

•Gene-Age Interaction

For Your Reference: An Overall Map of the How Open Mx Processes Built the ADE Model as Coded in twinADEcon.R

Make matrices pathA, pathD, pathE

Do Matrix Algebra w Matrices covA, covD, covE, covMZ,covDZCall Data for Use in the Model dataMZ, dataDZ

Call Data for Use in the Model dataMZ, dataDZ

Build Model from Matrices/AlgebrasmodelMZ, modelDZ