
REGULAR ARTICLE

The Impact of Gene–Environment Interaction and Correlation on the Interpretation of Heritability

Omri Tal

Received: 16 November 2010 / Accepted: 10 August 2011

© Springer Science+Business Media B.V. 2011

Abstract The presence of gene–environment statistical interaction (GxE) and correlation (rGE) in biological development has led both practitioners and philosophers of science to question the legitimacy of heritability estimates. The paper offers a novel approach to assess the impact of GxE and rGE on the way genetic and environmental causation can be partitioned. A probabilistic framework is developed, based on a quantitative genetic model that incorporates GxE and rGE, offering a rigorous way of interpreting heritability estimates. Specifically, given an estimate of heritability and the variance components associated with estimates of GxE and rGE, I arrive at a probabilistic account of the relative effect of genes and environment.

Keywords Heritability · Quantitative genetics · Probability · GxE interaction · G–E correlation

1 Introduction

We may want to get an intuitive feeling for the ‘importance’ of genetic

variation in the population and a reasonable measure of this relative

importance is the broad heritability.—Lewontin (1975)

Can the broad heritability provide reliable intuition on the importance of genetic causes in producing phenotypic variation? Heritability is variously expressed as the correspondence between a latent genetic variable and a measurable phenotypic one, the slope of the parent-offspring regression, or the proportion of phenotypic variance due to genetic differences. This latter formulation is common in the

O. Tal (✉)
School of Philosophy and The Cohn Institute for the History and Philosophy of Science and Ideas, Tel Aviv University, Tel Aviv 69978, Israel
e-mail: [email protected]

Acta Biotheor

DOI 10.1007/s10441-011-9139-8

literature, and perhaps the least confusing to conceptualize. However, the variance is only one measure of population dispersion in a certain quantity. The standard deviation (SD), specified in the underlying units of the target variables, is arguably a more natural description, and mathematically, a particular variance ratio V_G/V_P necessarily implies a different (higher) ratio of SDs: a heritability of 0.5, for example, corresponds to an SD ratio of √0.5 ≈ 0.71. But even the SD is not naturally congruent with how individual differences are commonly conceptualized, since it is essentially based on squared distances from the population mean, rather than on pairwise differences. Moreover, heritability estimates are population measures, and consequently say little about the relative impact of genetic and environmental causal factors on individual phenotypic development. When nonlinear and interactive elements are introduced into the developmental framework, the interpretation and usage of heritability estimates become even more contentious. In a discussion on the role of genes in development, Rose (1999) criticizes the use of heritability estimates, arguing that such estimates are meaningful only in the absence of gene–environment interaction and under a random distribution of genotypes across environments:

If genotypes are distributed randomly across environments, it is possible to

estimate heritability, which defines the proportion of the variance which is

genetically determined. However, the mathematics only works if all the

relevant simplifying assumptions are made. If there is a great deal of

interaction between genes and environment, that is if genes behave according

to Dobzhansky’s (1973) vision of norms of reaction, if genes interact with

each other, and if the relationships are not linear and additive but interactive,

the entire mathematical apparatus of heritability estimates falls apart. Thus the meaningful application of heritability estimates is only possible in very special cases, from which the majority of traits of interest outside the special world of artificial selection are likely to escape [added emphasis].

The following observation may shed some light on Rose’s assertion. A partition of

the phenotypic variance into genetic and nongenetic components is related to the

correspondence of a latent variable to a measured one. The quantification of the correspondence between phenotypic and genotypic values is central to the analysis of response to selection, familial resemblance and phenotypic development in quantitative genetics (Lynch and Walsh 1998). A useful measure of the linear correspondence of P and G under an additive linear model (P = G + E) is the squared correlation coefficient, or the coefficient of determination. Formally, this measures the proportion of the variance in P that is explained by assuming that the true regression E[P|G] is linear (ibid., p. 47). From basic principles,

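\[
\rho^2(P, G) = \left( \frac{\mathrm{COV}(G + E,\, G)}{\sigma_P\, \sigma_G} \right)^{2} = \frac{\left( V_G + \mathrm{COV}(G, E) \right)^{2}}{V_P\, V_G} = \frac{V_G}{V_P}.
\]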

Crucially, the coefficient of determination is equivalent to the common formulation for broad heritability (the proportion of the phenotypic variance due to genetic variation) only when the phenotypic model is additive and COV(G, E) is zero.
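As an illustration of this equivalence, the following minimal sketch evaluates the squared correlation and the variance ratio from variance components. The function names and the numerical components are hypothetical and chosen only for illustration; they do not come from any data set.

```python
def r_squared_pg(v_g, v_e, cov_ge):
    """Squared correlation between P = G + E and G, from variance components."""
    v_p = v_g + v_e + 2.0 * cov_ge   # phenotypic variance of the additive model
    cov_pg = v_g + cov_ge            # COV(G + E, G)
    return cov_pg ** 2 / (v_p * v_g)


def variance_ratio(v_g, v_e, cov_ge):
    """The usual broad-heritability ratio V_G / V_P."""
    return v_g / (v_g + v_e + 2.0 * cov_ge)


# Hypothetical components: V_G = 0.6, V_E = 0.4, with three covariance values.
for cov_ge in (0.0, 0.1, -0.1):
    print(cov_ge, round(r_squared_pg(0.6, 0.4, cov_ge), 4),
          round(variance_ratio(0.6, 0.4, cov_ge), 4))
```

The two quantities agree when the covariance is zero and drift apart in opposite directions once it is not, which is the sense in which the variance-ratio reading of heritability depends on COV(G, E) = 0.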

Discussions of gene–environment interaction have been mired in conceptual confusion. The term 'GxE interaction' is often used loosely to mean that both genes and environment contribute to the response variable, but quantitative geneticists employ the term in the stricter statistical sense. Simply stated, GxE designates a contribution that some non-additive function of the hidden variables G and E makes to the phenotypic value, independently of the main effects of these variables. It can also be conceptualized as a relationship between the environment and a phenotype that depends on the genotype, or alternatively a genotype-phenotype relationship that depends on the environment (Carey 2002). Figure 1 is a depiction of GxE in terms of norms of reaction, reflecting possible relations between the underlying variables.

The term G–E correlation refers to the phenomenon where exposure to environment may have a genetic basis. Such correlation mainly occurs in observational studies, whenever the environment cannot be randomly assigned between genotypes in a controlled setting. More formally, G–E correlation is a feature of the distribution of genotypes within environments and exists whenever a genetic disposition leads individuals to develop under certain environments. The present paper proposes a method of incorporating estimates of G–E interaction and covariance within a probabilistic framework of genetic effects on individuals.

2 The Model

Quantitative genetic analysis generally proceeds by using measured phenotypic variances and covariances to estimate latent variance components within the framework of a postulated quantitative phenotypic model. The foundational discrete model of GxE interactions is z_ijk = μ + G_i + E_j + I_ij + e_ijk, where z_ijk denotes the value of the k'th replicate of genotype i (G_i) in environment j (E_j), I_ij denotes the interaction between genotype i and environment j, and e_ijk the specific environment, all terms with a mean of zero. To stress, e_ijk is the residual deviation of an individual's phenotype from the expectation of G_i + E_j + I_ij, where the residuals are uncorrelated (Lynch and Walsh 1998, p. 108). A standard abstraction of the discrete model is P = f(G, E), with a simplified model P = μ + G + E + I serving as a workable approximation. In this model the E variable represents both general systematic and specific nonsystematic environmental effects. In this respect, the simplified linear model makes no distinction between epigenetic, systematic, nonsystematic or stochastic effects.

Fig. 1 The norms-of-reaction for three separate developmental scenarios


The partition of phenotypic variance follows directly from the standard quantitative genetic model (Falconer and Mackay 1996),

\[
P = G + E + I \;\rightarrow\; V_P = V_G + V_E + V_I + 2\,\mathrm{COV}(G, E) \tag{1}
\]

where the population mean μ is either assumed zero or subsumed in the genetic component. Modeling the measured and latent variables requires several standard assumptions. It is standard practice to assume the normality of the phenotypic distribution for many quantitative traits (some traits, such as litter size or lifespan, are clearly not normally distributed, but adequate transformations can be invoked). The marginal normality of the two components I and G + E then follows from the decomposition of a normal random variable (P = G + E + I) and its stability features under ε-independence and deviations from normality (Tal 2009). Finally, we adopt the standard assumption of the joint normality of G and E. This is justified by observations of the linearity of statistical regressions, such as parent-offspring regression (Lynch and Walsh 1998, p. 552; Tal 2009; Hill 2010), and a marginal normality of G, due to the high proportion of additive genetic variance for complex traits (Hill et al. 2008). Formally, the quantitative model requires,

[a] P is a standardized normally distributed trait
[b] The joint distribution of G and E is bivariate normal with a possible covariance term
[c] I is normally distributed
[d] I is statistically independent of G + E
[e] Availability of estimates: heritability h² = V_G/V_P, the fraction of V_P due to GxE interaction γ² = V_I/V_P, and a G–E correlation coefficient, ρ.
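The variance partition of Eq. 1 under assumptions [b] to [d] can be checked with a small simulation. The sketch below is only an illustration: the variance components, the sample size and the seed are hypothetical, and I is drawn independently of both G and E, which is slightly stronger than assumption [d].

```python
import math
import random


def simulate(v_g, v_e, v_i, rho, n=100_000, seed=1):
    """Draw n triples (G, E, I): G and E bivariate normal with correlation rho,
    I normal and independent of both (a simulation assumption)."""
    rng = random.Random(seed)
    sd_g, sd_e, sd_i = math.sqrt(v_g), math.sqrt(v_e), math.sqrt(v_i)
    g, e, i = [], [], []
    for _ in range(n):
        z1, z2, z3 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        g.append(sd_g * z1)
        # Mixing z1 into E induces corr(G, E) = rho while keeping Var(E) = v_e.
        e.append(sd_e * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
        i.append(sd_i * z3)
    return g, e, i


def cov(xs, ys):
    """Population-style sample covariance; cov(xs, xs) is the variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


V_G, V_E, V_I, RHO = 0.5, 0.3, 0.1, 0.3          # hypothetical components
g, e, i = simulate(V_G, V_E, V_I, RHO)
p = [a + b + c for a, b, c in zip(g, e, i)]       # P = G + E + I
v_p_sample = cov(p, p)
v_p_eq1 = V_G + V_E + V_I + 2.0 * RHO * math.sqrt(V_G * V_E)
print(round(v_p_sample, 3), round(v_p_eq1, 3))
```

The sampled phenotypic variance should sit close to the value given by Eq. 1, with the covariance term contributing 2ρ√(V_G V_E).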

The goal is to arrive at an expression for prob(|G| > |E| | P). Since we are modeling only the deviations from the means we can standardize P and assume zero means for all variables without losing information. Thus V_P = 1 and the variance of G becomes h². We then get from Eq. 1 and the above model assumptions the following marginal distributions and a covariance term,

\[
P \sim N(0, 1), \quad G \sim N(0, h^2), \quad E \sim N(0, V_E), \quad I \sim N(0, V_I), \quad \mathrm{COV}(G, E) = \rho \cdot h \cdot \sqrt{V_E}. \tag{2}
\]

Naturally, all the variances and SDs should be positive, i.e., h, γ and σ_E > 0, where σ_E² = V_E. In contrast, the correlation coefficient is allowed full range, −1 < ρ < 1. We would like to express σ_E in terms of the given estimates, h², γ² and ρ. From Eqs. 1 and 2 we have,

\[
1 = h^2 + \sigma_E^2 + \gamma^2 + 2 \rho h \sigma_E. \tag{3}
\]

Solving this as a quadratic equation for σ_E we get,

\[
\sigma_E = \sqrt{1 - h^2 (1 - \rho^2) - \gamma^2} - \rho \cdot h. \tag{4}
\]
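Eq. 4 translates directly into a small helper. The sketch below is a minimal illustration: the function name `sigma_e` and the error messages are hypothetical, and the inputs in the usage lines are the parameter values quoted in the caption of Fig. 2.

```python
import math


def sigma_e(h2, gamma2, rho):
    """Environmental SD implied by Eq. 4 for a standardized phenotype (V_P = 1)."""
    disc = 1.0 - h2 * (1.0 - rho ** 2) - gamma2
    if disc < 0.0:
        # The quadratic in Eq. 3 has no real root for these inputs.
        raise ValueError("no real solution for sigma_E")
    s = math.sqrt(disc) - rho * math.sqrt(h2)
    if s <= 0.0:
        # A non-positive SD means the three estimates are mutually inconsistent.
        raise ValueError("sigma_E must be positive")
    return s


h2, gamma2, rho = 0.6, 0.1, 0.3
s = sigma_e(h2, gamma2, rho)
# Substituting back into Eq. 3 should recover V_P = 1.
total = h2 + s ** 2 + gamma2 + 2.0 * rho * math.sqrt(h2) * s
print(round(s, 4), round(total, 12))
```

Raising an error for inconsistent inputs is a design choice of this sketch; it makes explicit that not every combination of h², γ² and ρ corresponds to a valid model.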

A corollary of Eq. 4 is that h² ≤ (1 − γ²)/(1 − ρ²), with a stricter upper bound h² < 1 − γ² for ρ > 0, from σ_E > 0. Since (from assumptions) the probability density function (pdf) for G and E is bivariate normal with correlation of ρ, the general expression is,

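\[
f_{h^2, \sigma_E, \rho}(g, e) = \frac{1}{2 \pi h \sigma_E \sqrt{1 - \rho^2}} \cdot \exp\left( -\frac{1}{2 (1 - \rho^2)} \left[ \frac{g^2}{h^2} - 2 \rho \frac{g}{h} \cdot \frac{e}{\sigma_E} + \frac{e^2}{\sigma_E^2} \right] \right). \tag{5}
\]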

To formalize the probability that |G| > |E| we first need to arrive at the joint density function of G and E conditional on P, denoted F. From first principles of conditional probability we have,

\[
F_{p, h^2, \gamma^2, \rho}(g, e) = p_{G,E|P}(g, e \mid p) = \frac{p_{G,E,P}(g, e, p)}{p_P(p)} = \frac{p_{P|G,E}(p \mid g, e) \cdot p_{G,E}(g, e)}{p_P(p)} = \frac{p_I(p - g - e) \cdot p_{G,E}(g, e)}{p_P(p)}. \tag{6}
\]

Note that p_I(p − g − e) = p_{P|G,E}(p | g, e) since P = G + E + I; hence p_{P|G,E}(p | g, e) has the same variance as I (γ²) but a mean of g + e. The pdf of a normal random variable X with zero mean and non-zero variance v (we assume G, E, and the GxE term I in Eq. 1 are non-degenerate random variables) is given by

\[
\varphi_v(x) = \frac{1}{\sqrt{2 \pi v}}\, e^{-x^2 / 2v}. \tag{7}
\]

In terms of φ_v(x) and f we have,

\[
F_{p, h^2, \gamma^2, \rho}(g, e) = \frac{\varphi_{\gamma^2}(p - g - e) \cdot f_{h^2, \sigma_E, \rho}(g, e)}{\varphi_1(p)} \tag{8}
\]

which results in a bivariate normal pdf after explicit substitution and arranging of terms (see "Appendix"),

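\[
F(g, e) = \frac{1}{2 \pi \sigma_1 \sigma_2 \sqrt{1 - \rho_{12}^2}} \exp\left( -\frac{1}{2 (1 - \rho_{12}^2)} \left[ \left( \frac{g - \mu_1}{\sigma_1} \right)^2 - 2 \rho_{12} \left( \frac{g - \mu_1}{\sigma_1} \right) \left( \frac{e - \mu_2}{\sigma_2} \right) + \left( \frac{e - \mu_2}{\sigma_2} \right)^2 \right] \right). \tag{9}
\]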
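The five parameters of Eq. 9 can also be obtained numerically without matching terms. The sketch below uses the standard conditioning rule for a zero-mean multivariate normal, which is a swapped-in technique and not the term-by-term derivation of the Appendix; it further assumes that I is uncorrelated with G and with E individually, as the step from Eq. 6 implies. The function name is hypothetical, and the inputs are the values quoted in the caption of Fig. 2.

```python
import math


def conditional_moments(p, h2, gamma2, rho):
    """Means, SDs and correlation of (G, E) given P = p, for P = G + E + I."""
    h = math.sqrt(h2)
    s = math.sqrt(1.0 - h2 * (1.0 - rho ** 2) - gamma2) - rho * h   # sigma_E, Eq. 4
    c = rho * h * s                    # COV(G, E)
    cov_gp = h2 + c                    # COV(G, P), taking COV(G, I) = 0
    cov_ep = s * s + c                 # COV(E, P), taking COV(E, I) = 0
    # Gaussian conditioning on P = p with V_P = 1:
    # mean = cov * p, covariance = prior covariance - outer product of cov terms.
    mu1, mu2 = cov_gp * p, cov_ep * p
    var1 = h2 - cov_gp ** 2
    var2 = s * s - cov_ep ** 2
    cov12 = c - cov_gp * cov_ep
    rho12 = cov12 / math.sqrt(var1 * var2)
    return mu1, mu2, math.sqrt(var1), math.sqrt(var2), rho12


mu1, mu2, sd1, sd2, rho12 = conditional_moments(p=0.2, h2=0.6, gamma2=0.1, rho=0.3)
print(round(mu1, 4), round(mu2, 4), round(sd1, 4), round(sd2, 4), round(rho12, 4))
```

One quick sanity check is that the two conditional means add up to p(1 − γ²), the expected value of G + E given P = p when I carries a fraction γ² of the variance.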

Figure 2 depicts F within the domain defined by |G| > |E|.

Finally, we employ F to express the conditional probability, prob(|G| > |E| | P). This probability is in fact the integral of F in the domain that satisfies |G| > |E|, expressed as a function of h², and denoted M_{p,ρ,γ²}(h²),

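\[
M_{p, \rho, \gamma^2}(h^2) = \mathrm{prob}(|G| > |E| \mid P = p) = \int_{-\infty}^{\infty} \int_{-|g|}^{|g|} F_{p, h^2, \gamma^2, \rho}(g, e)\, de\, dg
= \int_{-\infty}^{0} \int_{g}^{-g} F_{p, h^2, \gamma^2, \rho}(g, e)\, de\, dg + \int_{0}^{\infty} \int_{-g}^{g} F_{p, h^2, \gamma^2, \rho}(g, e)\, de\, dg. \tag{10}
\]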

It can easily be shown that the probability is independent of the sign of P, i.e., prob(|G| > |E| | P = p) = prob(|G| > |E| | P = −p) for any p; the probability is the same, whether the phenotype is below or above the population mean. Figure 3a–d depict several instances of M(h²) for combinations of P, GxE and rGE, as a function of heritability.
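The double integral in Eq. 10 can be approximated numerically in several ways. The sketch below is one hedged possibility, not the procedure used to draw the figures: it obtains the conditional moments of (G, E) given P by standard Gaussian conditioning, integrates E out analytically with the normal CDF, and applies a trapezoid rule over g. The function name, the integration window of eight SDs and the `steps` parameter are all hypothetical choices, and γ² must be positive.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_g_exceeds_e(p, h2, gamma2, rho, steps=4000):
    """Approximate prob(|G| > |E| | P = p) for the model P = G + E + I."""
    h = math.sqrt(h2)
    s = math.sqrt(1.0 - h2 * (1.0 - rho ** 2) - gamma2) - rho * h   # sigma_E, Eq. 4
    c = rho * h * s
    cov_gp, cov_ep = h2 + c, s * s + c
    mu1, mu2 = cov_gp * p, cov_ep * p                 # conditional means of G and E
    var1, var2 = h2 - cov_gp ** 2, s * s - cov_ep ** 2
    cov12 = c - cov_gp * cov_ep
    sd1 = math.sqrt(var1)
    slope = cov12 / var1                               # regression of E on G, given P
    sd_e_given_g = math.sqrt(var2 - cov12 ** 2 / var1)
    lo, hi = mu1 - 8.0 * sd1, mu1 + 8.0 * sd1
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        g = lo + k * width
        m = mu2 + slope * (g - mu1)                    # mean of E given G = g and P = p
        # Probability that E falls in (-|g|, |g|) for this value of g.
        inner = (STD_NORMAL.cdf((abs(g) - m) / sd_e_given_g)
                 - STD_NORMAL.cdf((-abs(g) - m) / sd_e_given_g))
        weight = 0.5 if k in (0, steps) else 1.0       # trapezoid end-point weights
        total += weight * inner * math.exp(-0.5 * ((g - mu1) / sd1) ** 2)
    return total * width / (sd1 * math.sqrt(2.0 * math.pi))


print(round(prob_g_exceeds_e(0.25, 0.7, 0.2, 0.3), 3))
print(round(prob_g_exceeds_e(-0.25, 0.7, 0.2, 0.3), 3))
```

The two printed values should coincide, reflecting the independence from the sign of P. A second check is the exchangeable case ρ = 0 with h² = σ_E², where the probability should be one half for any p.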



It is instructive to observe some features that are immediately discernible from the probability graphs. Figure 3a–c show a consistent pattern: as genes and environment become more positively correlated, or as GxE effects increase, the probability of |G| > |E| is higher across the whole range of heritability values. Figure 3a and b depict a divergence of the curves as we traverse the heritability axis: at low heritability estimates and for individuals close to the population mean, the probability that |G| > |E| is insensitive to the presence of either interaction or correlation effects. For instance, Fig. 3a shows that for phenotypic values close to the mean, at a heritability range up to 0.4 there is less than a 10% change in the resulting probability across a wide range of G–E correlation. A different pattern is discernible for high phenotypic deviations: the probability curves re-converge at the high heritability range, as Fig. 3c depicts, for P = 3. This means that for high heritability values the probability that |G| > |E| is less sensitive to the presence of interaction for individuals that deviate largely from the mean. Figure 3d shows how the probability curve depends on the phenotypic value. Crucially, all the probability curves intersect at a single heritability threshold corresponding to the neutral probability of 50%, where prob(|G| > |E| | P) = 0.5. It is also discernible that up to this heritability threshold, for any combination of GxE and rGE, lower values of |P| correspond to a higher probability that |G| > |E|; above that threshold the pattern is reversed, and lower values of |P| correspond to lower probabilities.

The probabilities discussed so far are conditional on a phenotypic value. The unconditional probability of |G| > |E| is simply the average over the phenotypic distribution. We denote by M̄ the expected value of M across the population,

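\[
\bar{M}_{\rho, \gamma^2}(h^2) = \mathrm{prob}(|G| > |E|) = E(M(p)) = \int_{-\infty}^{\infty} \varphi_1(p)\, M(p)\, dp
= \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{-p^2 / 2} \left( \int_{-\infty}^{0} \int_{g}^{-g} F(g, e)\, de\, dg + \int_{0}^{\infty} \int_{-g}^{g} F(g, e)\, de\, dg \right) dp. \tag{11}
\]

Figure 4 depicts M̄ for a particular instance of GxE and rGE.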
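Because averaging over P returns the unconditional joint distribution of G and E, the population-level probability can be cross-checked without any numerical integration. The sketch below relies on a standard result for zero-mean bivariate normal variables (Sheppard's orthant formula), which is brought in here as an assumption of the illustration and is not part of the derivation above: |G| > |E| is equivalent to (G − E)(G + E) > 0, and for jointly normal U and V with correlation r, prob(UV > 0) = 1/2 + arcsin(r)/π. The function name is hypothetical.

```python
import math


def unconditional_prob(h2, gamma2, rho):
    """prob(|G| > |E|) under the bivariate normal model for (G, E) with V_P = 1."""
    h = math.sqrt(h2)
    s = math.sqrt(1.0 - h2 * (1.0 - rho ** 2) - gamma2) - rho * h   # sigma_E, Eq. 4
    c = rho * h * s                      # COV(G, E)
    var_u = h2 + s * s - 2.0 * c         # Var(G - E)
    var_v = h2 + s * s + 2.0 * c         # Var(G + E)
    r = (h2 - s * s) / math.sqrt(var_u * var_v)   # corr(G - E, G + E)
    return 0.5 + math.asin(r) / math.pi


# Parameter values from the Fig. 4 caption, at a few heritabilities.
for h2 in (0.2, 0.4, 0.6, 0.8):
    print(h2, round(unconditional_prob(h2, gamma2=0.1, rho=0.3), 3))
```

In this form the interaction variance enters only through σ_E, and the probability equals one half exactly when h² = σ_E².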

Fig. 2 An instance of the bivariate normal density F, given h² = 0.6, γ² = 0.1, ρ = 0.3 and p = 0.2, where F is restricted to the domain satisfying |G| > |E| (axes: G, E and the density F)

We now turn to extracting similar probabilities from a purely additive model, where GxE interaction is absent but G–E correlation is potentially present. Note we cannot merely plug γ² = 0 into expressions (10) and (11), since the general model in Eq. 1 assumes that a GxE factor exists and has non-zero variance. However, the probability M converges to M0 in Eq. 17 as γ² approaches zero. This will cover those study designs that do not model GxE but still factor a variance component from rGE. We therefore proceed in a similar fashion to Eq. 1, briefly,

\[
P = G + E \;\rightarrow\; V_P = V_G + V_E + 2\,\mathrm{COV}(G, E). \tag{12}
\]

Formally, the quantitative framework is,

Fig. 3 The graphs of prob(|G| > |E| | P) as a function of the phenotypic value, G–E correlation and the GxE variance component (x-axis: heritability; y-axis: prob(|G| > |E| | P)). a The graphs of M(h²) for various values of G–E correlation (−0.3 to 0.3). These are the probabilities that the genetic deviation of an individual |G| had a greater effect than its environmental deviation |E| on its phenotypic deviation |P|, for p = 0.25 and GxE variance component of 0.2. At the low heritability range the probability is insensitive to the presence of G–E correlation. b The graphs of M(h²) for various values of the GxE effect (variance component 0.1 to 0.5) for p = 0.5 and a G–E correlation of 0.2. At the low heritability range the probability is insensitive to GxE. c The graphs of M(h²) for various values of the GxE effect (variance component 0.1 to 0.5) for p = 3 and a G–E correlation of 0. The probability curves converge at high heritability, where the effect of GxE is reduced. d The graphs of M(h²) for various values of P (0 to 2) for a G–E correlation of 0.3 and a GxE variance component of 0.1. The probability curves intersect at a certain heritability value where the probability is insensitive to the phenotypic value

[a'] P = G + E is a normally distributed trait, standardized to unit variance and zero mean
[b'] G and E are bivariate normal, with possible non-zero covariance
[c'] We obtain estimates of h² and the rGE correlation coefficient ρ

This implies,

\[
P \sim N(0, 1), \quad G \sim N(0, h^2), \quad E \sim N(0, \sigma_E^2), \quad \mathrm{COV}(G, E) = \rho h \sigma_E. \tag{13}
\]

From Eqs. 12 and 13 we have 1 = h² + σ_E² + 2ρhσ_E, and solving this as a quadratic equation for σ_E we get,

\[
\sigma_E = \sqrt{1 - h^2 (1 - \rho^2)} - \rho \cdot h. \tag{14}
\]

Note that ρ does not restrict the range of h². The bivariate normal pdf for G and E with correlation of ρ, denoted f0, is,

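\[
f^{0}_{h^2, \rho}(g, e) = \frac{1}{2 \pi h \sigma_E \sqrt{1 - \rho^2}} \cdot \exp\left( -\frac{1}{2 (1 - \rho^2)} \left[ \frac{g^2}{h^2} - 2 \rho \frac{g}{h} \cdot \frac{e}{\sigma_E} + \frac{e^2}{\sigma_E^2} \right] \right). \tag{15}
\]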

To formalize the probability that |G| > |E| we need to arrive at the density function of G conditional on P, denoted F0. From first principles of conditional probability and Eq. 12 we have,

\[
F^{0}_{p, h^2, \rho}(g) = p_{G|P}(g \mid p) = \frac{p_{P,G}(p, g)}{p_P(p)} = \frac{p_{E|G}(p - g \mid g) \cdot p_G(g)}{p_P(p)} = \frac{p_{G,E}(g, p - g)}{p_P(p)} = \frac{f^{0}_{h^2, \rho}(g, p - g)}{\varphi_1(p)}
= \frac{1}{\sqrt{2 \pi \cdot h^2 \sigma_E^2 (1 - \rho^2)}} \cdot \exp\left( -\frac{\left( g - p h \sqrt{1 - \sigma_E^2 (1 - \rho^2)} \right)^2}{2 \cdot h^2 \sigma_E^2 (1 - \rho^2)} \right). \tag{16}
\]

Fig. 4 The graph of M̄ for a G–E correlation of 0.3 and a GxE variance component of 0.1: the probability that |G| > |E| averaged over the phenotypic distribution (x-axis: heritability; y-axis: prob(|G| > |E|))

F0 is expressed in Eq. 16 in a form easily recognizable as a normal pdf, where the variance is h²σ_E²(1 − ρ²) and the mean is p·h·√(1 − σ_E²(1 − ρ²)). The expression for M0 = prob(|G| > |E| | P) is finally,

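\[
M^{0}_{p, h^2} = \mathrm{prob}(|G| > |E| \mid P) =
\begin{cases}
\displaystyle \int_{p/2}^{\infty} F^{0}_{p, h^2, \rho}(g)\, dg & \text{if } p \ge 0 \\[2ex]
\displaystyle \int_{-\infty}^{p/2} F^{0}_{p, h^2, \rho}(g)\, dg & \text{if } p < 0
\end{cases} \tag{17}
\]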
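Since F0 is a univariate normal density, Eq. 17 reduces to a single evaluation of the normal CDF. The sketch below is a minimal illustration with a hypothetical function name. It writes the conditional mean as p·COV(G, P) = p(h² + ρhσ_E), which coincides with the mean given in Eq. 16 whenever h + ρσ_E is non-negative; this substitution is an assumption of the sketch made to keep the sign correct for strongly negative ρ.

```python
import math
from statistics import NormalDist


def m0(p, h2, rho):
    """prob(|G| > |E| | P = p) for the additive model P = G + E with V_P = 1."""
    h = math.sqrt(h2)
    s = math.sqrt(1.0 - h2 * (1.0 - rho ** 2)) - rho * h    # sigma_E, Eq. 14
    mean = p * (h2 + rho * h * s)                            # mean of G given P = p
    sd = h * s * math.sqrt(1.0 - rho ** 2)                   # SD of G given P = p
    g_given_p = NormalDist(mean, sd)
    # With E = p - G, |G| > |E| holds when G > p/2 (p >= 0) or G < p/2 (p < 0).
    if p >= 0:
        return 1.0 - g_given_p.cdf(p / 2.0)
    return g_given_p.cdf(p / 2.0)


for h2 in (0.3, 0.5, 0.7):
    print(h2, [round(m0(p, h2, 0.0), 3) for p in (0.0, 0.25, 2.0)])
```

With ρ = 0 the function returns one half at h² = 0.5 for every p, and one half at p = 0 for every h², in line with the behavior described for the zero-correlation additive case.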

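The expected value of M0 across the population is denoted M̄0,

\[
\bar{M}^{0}(h^2) = \mathrm{prob}(|G| > |E|) = E\left( M^{0}_{h^2}(p) \right) = \int_{-\infty}^{\infty} \varphi_1(p)\, M^{0}_{h^2}(p)\, dp
= \int_{-\infty}^{0} \varphi_1(p) \int_{-\infty}^{p/2} F^{0}_{p, h^2, \rho}(g)\, dg\, dp + \int_{0}^{\infty} \varphi_1(p) \int_{p/2}^{\infty} F^{0}_{p, h^2, \rho}(g)\, dg\, dp
= 2 \int_{0}^{\infty} \frac{1}{\sqrt{2 \pi}}\, e^{-p^2 / 2} \int_{p/2}^{\infty} F^{0}_{p, h^2, \rho}(g)\, dg\, dp. \tag{18}
\]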

Figure 5a and b depict the probabilities generated within this additive framework

for the most basic scenario where G–E correlation is zero.

3 Discussion

The framework outlined in this paper allows incorporating estimates of gene–environment interaction and covariance within a probabilistic interpretation of heritability (Tal 2009; see Tal et al. 2010, for an extension that includes a putative epigenetic variable). Specifically, given estimates of heritability and the variance components associated with GxE and rGE, a method based on the standard quantitative model generates the conditional probability that genetic factors had a greater effect than environmental factors on a deviation from the population mean. Previous approaches to reformulating heritability arise from the application of more complex mixed quantitative models (see Oakey et al. 2007, for a "generalized heritability" that incorporates pedigree information to form an extended pedigree model).

Two underlying assumptions of the present framework should be reemphasized. First, the model takes as a point of departure the standard linear quantitative model with interaction, P = G + E + I, where the three latent terms are modeled as continuous variables. Second, the probabilistic framework relies on the availability of estimates of variance components describing the non-additive effects. In this respect, it is a methodological issue whether a particular sample size or genetic model has sufficient power to detect and quantify GxE for a target experimental design or observational study. Indeed, it is well known that the power to detect interaction is considerably less than that of the main effects (Wahlsten 1990; Sternberg and Grigorenko 1997; Plomin et al. 2008). There are a few experimental strategies for increasing the power of GxE detection, primarily using a greater sample size. However, in controlled experimental designs the detection of GxE may be an artifact of the maximization of the phenotypic variance, due to the relatively higher additive genotypic variance compared to natural randomly breeding populations. Considerations of strain replication in the sample are also pertinent. Using high replication of fewer genotypes in fewer environments yields greater sensitivity to GxE detection than low replication of more genotypes in more environments. This illustrates a basic tension in defining research targets: whether to increase replication of a few genotypes in a few environments or to better represent the range of genetic or environmental variation possible at a cost to the sensitivity to detect small differences (Hodgins-Davis and Townsend 2009). A related methodological issue with quantitative phenotypes is the mathematical ability to induce or remove interaction effects simply by transformation of scale (Wahlsten 1990). Ultimately, it is an empirical issue whether GxE interaction exists with respect to a particular target trait, the given distribution of genotypes and the range of environments.
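The point about transformation of scale can be made concrete with a toy two-by-two design. The sketch below is purely illustrative: the genotype and environment effects are hypothetical numbers, and the cell means are generated multiplicatively so that the interaction is present on the raw scale and absent on the logarithmic one.

```python
import math


def interaction_contrast(cells):
    """2x2 interaction contrast: (z11 - z12) - (z21 - z22)."""
    (z11, z12), (z21, z22) = cells
    return (z11 - z12) - (z21 - z22)


# Hypothetical effects for two genotypes (rows) and two environments (columns).
genotype_effect = (2.0, 4.0)
environment_effect = (1.0, 3.0)

# Multiplicative cell means: non-parallel norms of reaction on the raw scale.
raw = [[g * e for e in environment_effect] for g in genotype_effect]
# Taking logarithms turns the product into a sum, so the norms become parallel.
logged = [[math.log(z) for z in row] for row in raw]

print(interaction_contrast(raw))
print(round(interaction_contrast(logged), 12))
```

A non-zero contrast signals statistical interaction; here it vanishes after the logarithmic transformation even though the underlying cell means are unchanged.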

An important theoretical issue is whether the presence of any amount of interaction, or even the possibility of undetected interaction, renders the partition of the phenotypic variance meaningless in terms of its causal explanatory content (see, e.g., Lewontin 1974; Sarkar 1998, for a critical perspective; Sesardic 2005 for an opposing view; Oftedal 2005 for an attempt at conciliation; Tabery 2009 for the relation to difference mechanisms). Arguably, there is no clear criterion that allows distinguishing "strong" from "weak" interactions. Is it only when norms of reaction cross, as in the rightmost graph of Fig. 1? Is it when the variance due to GxE reaches a certain proportion of the total phenotypic variance? What is clear is that attempting to separate the effects of genes and environment under substantial GxE is futile, since the interaction component is ultimately some unknown combination of G and E. Given the standard model with interaction, P = G + E + I, if I is large compared to G and E, our probabilistic framework cannot capture the full relation between the latent genetic and environmental values. On the other hand, the presence of strong G–E correlation does not pose such difficulty. This follows from the fact that the correlation is not a separate component of the phenotypic value, P = G + E + I, but only a characteristic of the joint distribution of G and E, which is fully captured by the conditional probability prob(|G| > |E| | P).

Fig. 5 a The graphs of M0 from Eq. 17 for various values of P (positive or negative deviations, p = 0 to 2) when the model used is P = G + E and V_P = V_G + V_E, i.e., no GxE interaction or G–E correlation. The probability curves intersect at h² = 0.5 and the probability for p = 0 is 0.5 irrespective of h² (compare to Fig. 3d). b The graph of M̄0 from Eq. 18, the probability that |G| > |E| averaged over the distribution of P

Finally, an application of the probabilistic framework would involve plugging the various estimates of variance components into the probability function M in Eq. 10, or M̄ in Eq. 11. For instance, if rGE = 0.3 and the variance portion due to GxE is 0.2, then for a heritability of 0.7 the probability that |G| > |E| is 0.8 for individuals at ¼ SD from the population mean (see Fig. 3a). It is instructive to compare such results with the probabilities generated from a model that ignores GxE and rGE, via M0 in Eq. 17. Using, for comparative illustration, a phenotypic deviation of ¼ SD and heritability of 0.7, we have M0(0.7) = prob(|G| > |E| | P = ¼) = 0.55, in marked contrast with the higher probability M(0.7) = 0.8, based on a model with some GxE and rGE effects. In a similar fashion, one could compare the averaged probabilities from M̄ in Eq. 11 with M̄0 in Eq. 18. Such comparisons are most pertinent in the context of studies that employ a dual design: a reduced-fit additive model that ignores GxE and rGE effects, and a better-fit model that incorporates these effects. The probabilities generated using variance component estimates from the better-fit model may better describe the relative weight of genes and environment on the deviation of a target trait from the population mean.

Acknowledgments I would like to thank Jim Tabery, Eva Jablonka, John Loehlin, John C. DeFries, Neven Sesardic, Samir Okasha, Tamir Tassa and two anonymous reviewers for insightful feedback and suggestions.

Appendix: Derivation of the Bivariate Normal Form for F

We wish to express F from Eq. 8 such that it complies with the bivariate normal form of Eq. 9, if indeed it can be expressed as such. Towards that end we need to find expressions for σ₁, σ₂, μ₁, μ₂ and ρ₁₂ in terms of the parameters of F. Therefore, we write F such that terms involving powers of its independent variables g and e are separately expressed,

\[
F(g, e) = \frac{1}{2 \pi \cdot h \cdot \gamma \cdot \sigma_E \cdot \sqrt{1 - \rho^2}} \cdot \exp\left( \frac{X}{2 \gamma^2 (1 - \rho^2) h^2 \sigma_E^2} \right) \tag{19}
\]

where,

\[
\begin{aligned}
X = {} & -g^2 \left( (1 - \rho^2) h^2 \sigma_E^2 + \gamma^2 \sigma_E^2 \right) - e^2 \left( (1 - \rho^2) h^2 \sigma_E^2 + \gamma^2 h^2 \right) \\
& + (g + e) \left( 2 p (1 - \rho^2) h^2 \sigma_E^2 \right) - 2 g e \left( (1 - \rho^2) h^2 \sigma_E^2 - \gamma^2 \rho h \sigma_E \right) \\
& - p^2 (1 - \rho^2) h^2 \sigma_E^2 (1 - \gamma^2)
\end{aligned} \tag{20}
\]

and where σ_E = √(1 − h²(1 − ρ²) − γ²) − ρ·h, as derived in Eq. 4.


Equating similar terms in Eq. 9 with Eqs. 19 and 20 leads to seven simultaneous equations for the five unknowns. For instance, equating the g² term, we get,

\[
\frac{-g^2 \left( (1 - \rho^2) h^2 \sigma_E^2 + \gamma^2 \sigma_E^2 \right)}{2 \gamma^2 (1 - \rho^2) h^2 \sigma_E^2} = \frac{-g^2}{2 (1 - \rho_{12}^2)\, \sigma_1^2}
\]

and similarly for the terms related to e², g, e, ge, the constant within the exponent and the coefficient of the exponent. The resulting mean, variance and covariance parameters for F are,

\[
\rho_{12} = S \cdot \sqrt{1 - \frac{\gamma^2 (1 - \rho^2)}{\left( (1 - \rho^2) \sigma_E^2 + \gamma^2 \right) \left( (1 - \rho^2) h^2 + \gamma^2 \right)}}
\]

where,

\[
S =
\begin{cases}
-1 & \text{if } \rho < \dfrac{\sqrt{\gamma^4 + 4 h^2 \sigma_E^2} - \gamma^2}{2 h \sigma_E} \\[2ex]
\phantom{-}0 & \text{if } \rho = \dfrac{\sqrt{\gamma^4 + 4 h^2 \sigma_E^2} - \gamma^2}{2 h \sigma_E} \\[2ex]
+1 & \text{if } \rho > \dfrac{\sqrt{\gamma^4 + 4 h^2 \sigma_E^2} - \gamma^2}{2 h \sigma_E}
\end{cases}
\]

and,

\[
\sigma_1 = h \sqrt{(1 - \rho^2) \sigma_E^2 + \gamma^2}, \qquad \sigma_2 = \sigma_E \sqrt{(1 - \rho^2) h^2 + \gamma^2},
\]

\[
\mu_1 = \frac{p}{\gamma^2} \left( \left( (1 - \rho^2) \sigma_E^2 + \gamma^2 \right) h^2 - \sqrt{\left( (1 - \rho^2) \sigma_E^2 + \gamma^2 \right) \left( (1 - \rho^2) h^2 + \gamma^2 \right) - \gamma^2 (1 - \rho^2)} \cdot \sigma_E h \right),
\]

\[
\mu_2 = \frac{p}{\gamma^2} \left( \left( (1 - \rho^2) h^2 + \gamma^2 \right) \sigma_E^2 - \sqrt{\left( (1 - \rho^2) \sigma_E^2 + \gamma^2 \right) \left( (1 - \rho^2) h^2 + \gamma^2 \right) - \gamma^2 (1 - \rho^2)} \cdot \sigma_E h \right),
\]

noting that σ_E itself is a function of the input parameters h², γ² and ρ, expressed in Eq. 4.
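The closed forms for σ₁, σ₂, μ₁, μ₂ and the magnitude of ρ₁₂ can be cross-checked numerically against the standard conditioning rule for a multivariate normal. The sketch below does this for one parameter set in which ρ₁₂ is negative. The function names are hypothetical, and the conditioning route is an independent check assumed for illustration, not the term-matching derivation used in this Appendix.

```python
import math


def sigma_e(h2, gamma2, rho):
    """Environmental SD from Eq. 4."""
    return math.sqrt(1.0 - h2 * (1.0 - rho ** 2) - gamma2) - rho * math.sqrt(h2)


def appendix_parameters(p, h2, gamma2, rho):
    """mu1, mu2, sigma1, sigma2 and |rho12| from the closed-form expressions."""
    h, s = math.sqrt(h2), sigma_e(h2, gamma2, rho)
    a = (1.0 - rho ** 2) * s * s + gamma2
    b = (1.0 - rho ** 2) * h2 + gamma2
    root = math.sqrt(a * b - gamma2 * (1.0 - rho ** 2))
    mu1 = p / gamma2 * (a * h2 - root * s * h)
    mu2 = p / gamma2 * (b * s * s - root * s * h)
    abs_rho12 = math.sqrt(1.0 - gamma2 * (1.0 - rho ** 2) / (a * b))
    return mu1, mu2, h * math.sqrt(a), s * math.sqrt(b), abs_rho12


def conditioning_parameters(p, h2, gamma2, rho):
    """The same quantities from Gaussian conditioning of (G, E) on P = p."""
    h, s = math.sqrt(h2), sigma_e(h2, gamma2, rho)
    c = rho * h * s
    cov_gp, cov_ep = h2 + c, s * s + c
    var1, var2 = h2 - cov_gp ** 2, s * s - cov_ep ** 2
    rho12 = (c - cov_gp * cov_ep) / math.sqrt(var1 * var2)
    return cov_gp * p, cov_ep * p, math.sqrt(var1), math.sqrt(var2), abs(rho12)


closed = appendix_parameters(0.2, 0.6, 0.1, 0.3)
direct = conditioning_parameters(0.2, 0.6, 0.1, 0.3)
largest_gap = max(abs(x - y) for x, y in zip(closed, direct))
print(largest_gap)
```

For this parameter set the two routes should agree to within floating-point rounding.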

References

Carey G (2002) Human genetics for the social sciences, 1st edn. Sage, Thousand Oaks

Dobzhansky T (1973) Nothing makes sense except in the light of evolution. Am Biol Teach 35:125–29

Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics, 4th edn. Longmans Green,

Harlow

Hill WG (2010) Understanding and using quantitative genetic variation. Philos Trans R Soc B 365:73–85

Hill WG, Goddard ME, Visscher PM (2008) Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet 4(2):e1000008. doi:10.1371/journal.pgen.1000008

Hodgins-Davis A, Townsend JP (2009) Evolving gene expression: from G to E to GxE. Trends Ecol Evol 24(12):649–658

Lewontin RC (1974) The analysis of variance and the analysis of causes. Am J Hum Genet 26:400–411

Lewontin RC (1975) Genetic aspects of intelligence. Annu Rev Genet 9:382–405

Lynch M, Walsh B (1998) Genetics and analysis of quantitative traits, 1st edn. Sinauer Associates,

Sunderland

Oakey H, Verbyla AP, Cullis BR, Wei X, Pitchford WS (2007) Joint modeling of additive and non-

additive (genetic line) effects in multi-environment trials. Theor Appl Genet 114:1319–1332

Oftedal G (2005) Heritability and causation. Philos Sci 72:699–709

Plomin R, DeFries JC, McClearn GE, McGuffin P (2008) Behavioral genetics, 5th edn. Worth, New York

Rose S (1999) Précis of lifelines: biology, freedom, determinism. Behav Brain Sci 22(5):871–885

Sarkar S (1998) Genetics and reductionism. Cambridge University Press, Cambridge

Sesardic N (2005) Making sense of heritability. Cambridge University Press, Cambridge

Sternberg RJ, Grigorenko E (1997) Intelligence, heredity, and environment. Cambridge University Press,

Cambridge

Tabery J (2009) Difference mechanisms: explaining variation with mechanisms. Biol Philos 24:645–664

Tal O (2009) From heritability to probability. Biol Philos 24:81–105

Tal O, Kisdi E, Jablonka E (2010) Epigenetic contribution to covariance between relatives. Genetics

184:1037–1050

Wahlsten D (1990) Insensitivity of the analysis of variance to heredity-environment interaction. Behav

Brain Sci 13:109–161

