
Am J Hum Genet 26:489-503, 1974

Analysis of Family Resemblance. III. Complex Segregation of Quantitative Traits

N. E. MORTON¹ AND C. J. MACLEAN¹

Received October 30, 1973; revised January 21, 1974.
PGL paper no. 113. This work was supported by grant 1 RO1 HD 06003 from the U.S. National Institutes of Health.
¹ Population Genetics Laboratory, University of Hawaii, Honolulu, Hawaii 96822.
© 1974 by the American Society of Human Genetics. All rights reserved.

In previous papers of this series, we have introduced problems of family resemblance, which include segregation and path analysis, group differences expressed in families of hybrid ancestry, linkage, mutation screening, parentage exclusion, and recurrence risks [1, 2]. Combination of Wright's path analysis [3] with Fisher's normalizing z transformation [4] was stressed as a means to discriminate among genetic factors, common and random environment, and their second-order effects (gene-environment correlation and assortative mating), both within and among intercrossing racial groups. Path analysis provides optimal discrimination between common environment and genetic factors, while segregation analysis is better for distinguishing major loci. Here we shall consider complex segregation analysis of quantitative traits under a model which includes polygenes, a major locus, and both random and common environment.

Most applications of segregation analysis have involved a dichotomy (normal versus affected), which not only has low power to discriminate complex hypotheses but is also out of touch with the needs of genetic counseling [5]. An individual apprehensive of diabetes should submit to a glucose-tolerance test because it has greater predictive power than the occurrence of diabetes among his relatives. Moreover, where familial data are useful, quantitative traits of relatives are often more informative than their affection status. Clearly quantitative as well as qualitative information should be used for both analysis and counseling. When full quantification is not feasible, a trichotomy (normal, intermediate, affected) provides more information than two states.

In addition to a major locus, polygenic variation, and environmental effects, our model incorporates a hypothesis about the relation of the quantitative trait to affection. Elston and Stewart [6] discussed mixed models as a generalization of polygenic and major-locus cases and concluded that "extension to the more general genetic models presents no theoretical problems, but may depend on practical advances in computer technology." We have found that the mixed model is within current capabilities. Although not including such complications as multiple alleles and two or more major loci, this model has the advantage of a manageably small set of parameters, flexibility to fit a wide variety of data, and ability to discriminate major loci from polygenes and common sibling environment by standard hypothesis-testing procedures. Much more important than estimating parameters under this model is validation of the various parts of the model itself, and all emphasis has been placed upon tests of hypotheses.

In this paper we restrict attention to nuclear families consisting of parents and their children. A subsequent paper will give the extension to larger pedigrees. We regret that limitations of the Roman alphabet force us to reassign symbols which were previously used with different connotations in path analysis [2]. Greek letters will be reserved for genotypes and phenotypes, lower case Roman letters for their effects, and Roman capitals for variance components, incidences, and thresholds.

THE MODEL

Consider a quantitative trait $x$, such that

$$x = g + c + e, \tag{1}$$

where $g$ is the effect due to a major locus, $c$ is the breeding value due to an indefinitely large number of additive genetic factors, and $e$ is the environmental contribution (fig. 1). The three sources act independently. The environmental effect is partitioned into two components, $e = e_s + e_r$, where $e_s$ is common within a sibship and $e_r$ is random. These effects are assumed independently normal, each with mean zero and variances $S$ and $R$, respectively, which are to be estimated. The total environmental effect is therefore also normal with mean zero and $\operatorname{var}(e) = S + R = E$.

FIG. 1. The mixed model for a quantitative trait (figure not reproduced; the horizontal axis is marked at $g_3$, $u$, $g_2$, and $g_1$).

The polygenic effect $c$ is normally distributed with mean zero and variance $C$, also to be estimated. We shall attribute only additive variation to $c$, assuming that nonlinear effects are negligible. The polygenic effect may be partitioned into two components,

$$c = b + y, \tag{2}$$

where $b$ is the midparental breeding value and $y$ is the individual deviation. Both $b$ and $y$ are normally distributed with mean zero, and under panmixia each independently contributes half the variance of $c$:

$$\operatorname{var}(b) = \operatorname{var}(y) = C/2. \tag{3}$$
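The halving in equation (3) can be checked in two lines. The sketch below assumes, as panmixia implies, that the two parental breeding values (written $c_{fa}$ and $c_{mo}$, the notation of equation (11)) are independent, each with variance $C$, and that $b$ and $y$ are independent:

```latex
\begin{aligned}
\operatorname{var}(b) &= \operatorname{var}\!\left(\tfrac{1}{2}(c_{fa} + c_{mo})\right)
                       = \tfrac{1}{4}(C + C) = \tfrac{C}{2}, \\
\operatorname{var}(y) &= \operatorname{var}(c) - \operatorname{var}(b)
                       = C - \tfrac{C}{2} = \tfrac{C}{2}.
\end{aligned}
```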

The major locus has the characteristics shown in table 1.

TABLE 1

CHARACTERISTICS OF MAJOR LOCUS

                              GENOTYPE
CHARACTERISTIC        ΓΓ          Γγ            γγ
Index, $i$            1           2             3
Frequency, $p_i$      $q^2$       $2q(1-q)$     $(1-q)^2$
Effect, $g_i$         $z + t$     $z + td$      $z$

We refer to $t$ as the displacement at the major locus and to $d$ as the degree of dominance. We shall take $t > 0$ so that substitution of Γ for γ represents a positive contribution to the trait $x$. This implies that $d = 0$ corresponds to a recessive contribution and $d = 1$ to a dominant one. A locus with $d = 1/2$ we shall call additive. In figure 1, $t$ is large with respect to the variation in $x$, and $d$ is somewhat less than $1/2$. The major locus has mean $E(g) = z + q^2 t + 2q(1-q)td$, which we shall call $u$. Since $u$ is often more easily estimated than $z$, we use it instead of the parameter $z$ by the substitution $z = u - q^2 t - 2q(1-q)td$. The variance of the major locus is $\operatorname{var}(g) = q^2(z+t)^2 + 2q(1-q)(z+td)^2 + (1-q)^2 z^2 - u^2$, which we shall call $G$. The major locus parameters to be estimated are $u$, $t$, $q$, and $d$. We shall be interested in the hypothesis that the effect of the major locus is megaphenic: that is, $t$ is large relative to the phenotypic standard deviation [7].

The parameters of the three sources defined by equation (1), $g$, $c$, and $e$, are to be inferred from samples of the quantitative trait $x$. Let us therefore consider the distribution of $x$ dependent upon them. By setting the means of both $c$ and $e$ at zero, we equate the mean of $x$ to the mean of $g$,

$$E(x) = u. \tag{4}$$

This involves no loss of generality. The three contributions to $x$ vary independently so that $\operatorname{var}(x) = \operatorname{var}(g) + \operatorname{var}(c) + \operatorname{var}(e)$. We shall let $V$ stand for the variance of $x$. Then

$$V = G + C + E. \tag{5}$$

Conditional upon a particular value $g$, $x$ is normally distributed as the sum of two normal variables, with $E(x \mid g) = g$ and $\operatorname{var}(x \mid g) = C + E$.
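The major-locus mean and variance defined above are simple enough to compute directly. The following is a minimal sketch, not the authors' program; the function names `major_locus_moments` and `z_from_u` are hypothetical and chosen only for illustration.

```python
def major_locus_moments(q, t, d, z=0.0):
    """Return (u, G): mean and variance of the major-locus effect g (table 1)."""
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]   # p_1, p_2, p_3
    effects = [z + t, z + t * d, z]                   # g_1, g_2, g_3
    u = sum(p * g for p, g in zip(freqs, effects))
    G = sum(p * g * g for p, g in zip(freqs, effects)) - u * u
    return u, G


def z_from_u(u, q, t, d):
    """Substitution used in the text: z = u - q^2 t - 2q(1-q)td."""
    return u - q * q * t - 2 * q * (1 - q) * t * d


# Illustrative values only: an additive locus (d = 1/2) with q = 1/2 and t = 2.
u, G = major_locus_moments(q=0.5, t=2.0, d=0.5)
V = G + 0.3 + 0.2   # equation (5) with assumed C = 0.3 and E = 0.2
```

Because `z` enters only as a shift, `G` does not depend on it, which is why the text can replace `z` by `u` without changing the variance components.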


In figure 1, each genotypic class is shown separately, with mean $g_i$ and standard deviation from that mean $\sqrt{C+E}$, the same for every class. The sum of these three classes gives the total distribution of $x$, with mean $u$ and standard deviation $\sqrt{V}$:

$$P(x) = \sum_{i=1}^{3} p_i\, f\!\left(\frac{x - g_i}{\sqrt{C+E}}\right), \tag{6}$$

where, as throughout this paper, $f(k)$ is defined with its differential element, $f(k) = (1/\sqrt{2\pi})\exp(-k^2/2)\,dk$. The probability $P(x)$ is assumed either to contain a differential element or not, whichever is appropriate. For parents we shall need the distribution of $x$ conditional upon both a value at the major locus and a value for the polygenic effect. This conditional distribution is normal, and if the entire breeding value $c$ is given, only $e$ is free to vary: $E(x \mid g, c) = g + c$ and $\operatorname{var}(x \mid g, c) = E$.

For children we need the conditional distribution when not the entire breeding value but only the midparental breeding value and environment common to sibs are given. This is the variation among siblings when all family parameters are given. In this case $e_r$ and $y$ both vary, and from equation (3), $E(x \mid g, b, e_s) = g + b + e_s$ and $\operatorname{var}(x \mid g, b, e_s) = C/2 + R$. Since both $b$ and $e_s$ are normally distributed, their sum is conveniently treated as a single variable, $v = b + e_s$. Then $E(x \mid g, v) = g + v$ and $\operatorname{var}(x \mid g, v) = C/2 + R$.

These are the necessary elements for deriving the relationship of a measured quantitative variable to its causes under a mixed genetic model. Next consider the relation between this quantitative trait and affection.
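The marginal distribution of $x$ in equation (6) is a three-component normal mixture. A minimal sketch follows, with hypothetical function names and the density written explicitly (that is, with the differential element of $f$ absorbed as the factor $1/\sqrt{C+E}$):

```python
import math


def normal_density(x, mean, var):
    """Ordinary normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def trait_density(x, u, q, t, d, C, E):
    """Equation (6): mixture of three normals, one per major genotype."""
    z = u - q * q * t - 2 * q * (1 - q) * t * d       # recover z from the mean u
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]
    effects = [z + t, z + t * d, z]
    return sum(p * normal_density(x, g, C + E) for p, g in zip(freqs, effects))
```

Integrating `trait_density` numerically over a wide range should return 1, and its mean should return `u`, which gives a quick sanity check on any parameter set.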

Liability

We suppose that the trait $x$ is related to affection through liability

$$w = x + k, \tag{7}$$

where $k$ is a normally distributed environmental effect with mean zero and variance $W$. Therefore the conditional distribution of $w$, given $x$, is normal with $E(w \mid x) = x$ and $\operatorname{var}(w \mid x) = W$. In principle $w$ cannot be measured (unless $W = 0$), but rather a threshold, $Z$, yields affection when $w > Z$ [8]. Therefore, liability determines a risk function which we shall call

$$Q(a) = \frac{1}{\sqrt{2\pi}} \int_{a}^{\infty} e^{-h^2/2}\, dh, \tag{8}$$

where $a = (Z - x)/\sqrt{W}$. This subsumes affection like diabetes, which could be defined on the trait of glucose intolerance, with $Z$ specifying the value of glucose intolerance above which an individual is classified as diabetic. In this case $W = 0$ and $Q(a) = 1$ if $x > Z$; $Q(a) = 0$ if $x < Z$. Also included is affection like myocardial infarction, which cannot be defined on serum cholesterol but is associated with hypercholesterolemia. Then $\sqrt{V/(V + W)}$ is the correlation between cholesterol ($x$) and phenotypic liability to infarction ($w$).
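The risk function of equation (8) is the upper tail of the standard normal, which the Python standard library exposes through `math.erfc`. The sketch below uses hypothetical function names; returning 0 at exactly $x = Z$ when $W = 0$ is an assumption, since the text leaves that boundary case undefined.

```python
import math


def upper_tail(a):
    """Q(a) of equation (8): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def affection_risk(x, Z, W):
    """Probability of affection given the trait value x, threshold Z, variance W."""
    if W == 0:
        # Degenerate case from the text: affection is defined on x itself.
        return 1.0 if x > Z else 0.0
    return upper_tail((Z - x) / math.sqrt(W))


def trait_liability_correlation(V, W):
    """Correlation between x and w = x + k, namely sqrt(V / (V + W))."""
    return math.sqrt(V / (V + W))
```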


In order to treat pedigrees in which not all members are measured for $x$, we shall need the relationship of $w$ directly to the genetic factors. For $x$ not measured, we write equation (7) in the form $w = g + c + e + k$. This case is illustrated in figure 2. The conditional distribution of $w$, given a value for the major locus, is normal, since $c$, $e$, and $k$ are normal and they enter $w$ as a sum. The mean and variance are $E(w \mid g) = g$ and $\operatorname{var}(w \mid g) = C + E + W$. Since $g$ takes on only three possible values, the total incidence of affection is

$$A = \sum_{i=1}^{3} p_i\, Q(a_i), \tag{9}$$

where $a_i = (Z - g_i)/\sqrt{C + E + W}$. An intermediate phenotype in a trichotomy is specified by a second threshold, $M$, with incidence

$$I = \sum_{i=1}^{3} p_i\, Q(m_i) - A, \tag{10}$$

where $m_i = (M - g_i)/\sqrt{C + E + W}$. The distribution of $w$ conditional upon both major locus $g$ and breeding value $c$ is normal with $E(w \mid g, c) = g + c$ and $\operatorname{var}(w \mid g, c) = E + W$. If the midparental breeding value and common sibling environmental effect ($b + e_s = v$) are given, $E(w \mid g, v) = g + v$ and $\operatorname{var}(w \mid g, v) = C/2 + R + W$.

FIG. 2. The mixed model when the quantitative trait is not measured (figure not reproduced).

These results constitute the distribution theory required to calculate the likelihood of pedigrees under the mixed genetic model. The primary calculation of this paper, that is, the probability that an individual have phenotype $\phi$ when $v$ and $g$ are given, has a different form corresponding to each of the six possible observations of $\phi$ (table 2).
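Equations (9) and (10) translate directly into code. The sketch below is illustrative only; `incidences` is a hypothetical name, and the second threshold is taken to satisfy $M < Z$ so that the intermediate incidence is nonnegative.

```python
import math


def upper_tail(a):
    """Q(a): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def incidences(q, t, d, z, C, E, W, Z, M=None):
    """Return (A, I) from equations (9) and (10); I is None if M is not given."""
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]
    effects = [z + t, z + t * d, z]
    sd = math.sqrt(C + E + W)                      # sd of liability given g
    A = sum(p * upper_tail((Z - g) / sd) for p, g in zip(freqs, effects))
    if M is None:
        return A, None
    above_M = sum(p * upper_tail((M - g) / sd) for p, g in zip(freqs, effects))
    return A, above_M - A
```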

LIKELIHOOD CALCULATIONS

Inference on the parameters of the mixed model is derived from the likelihood of sibships given the phenotypes of their parents. The first step is to calculate the probability of a pair of major genotypes and a midparental breeding value conditional upon phenotypes of the parents. The second step is to calculate the corresponding probability of the sibship conditional upon the parental major genotypes, midparental breeding value, and environment common to sibs. The final step is to total the conditional probabilities weighted by the corresponding parental probabilities over all possible combinations of major genotypes, midparental breeding values, and common sibling environmental effects.

TABLE 2

PROBABILITY THAT INDIVIDUAL HAS PHENOTYPE $\phi$ WHEN $v$ AND $g$ ARE GIVEN

$\phi$                              $P(\phi \mid g, v)$
Not observed                        $1$
Quantitative value $x$              $f\left[(x - g - v)/\sqrt{C/2 + R}\right]$
Affected                            $Q\left[(Z - g - v)/\sqrt{C/2 + R + W}\right]$
Intermediate                        $Q\left[(M - g - v)/\sqrt{C/2 + R + W}\right] - Q\left[(Z - g - v)/\sqrt{C/2 + R + W}\right]$
Normal, intermediate undefined      $1 - Q\left[(Z - g - v)/\sqrt{C/2 + R + W}\right]$
Normal, intermediate defined        $1 - Q\left[(M - g - v)/\sqrt{C/2 + R + W}\right]$
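The six cases of table 2 can be collected into a single function. This is a minimal sketch under stated assumptions: the string labels for the phenotype classes and the name `phenotype_probability` are hypothetical, the quantitative case returns a density rather than a probability, and $M < Z$ is assumed.

```python
import math


def upper_tail(a):
    """Q(a): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def phenotype_probability(kind, g, v, C, R, W, Z=None, M=None, x=None):
    """P(phi | g, v) for the six observation types of table 2."""
    mean = g + v
    if kind == "not_observed":
        return 1.0
    if kind == "quantitative":
        var = C / 2 + R                       # liability noise W plays no part here
        return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    sd = math.sqrt(C / 2 + R + W)
    above_Z = upper_tail((Z - mean) / sd)
    if kind == "affected":
        return above_Z
    if kind == "normal_intermediate_undefined":
        return 1.0 - above_Z
    above_M = upper_tail((M - mean) / sd)
    if kind == "intermediate":
        return above_M - above_Z
    if kind == "normal_intermediate_defined":
        return 1.0 - above_M
    raise ValueError("unknown phenotype class: " + kind)
```

For any $g$ and $v$ the affection classes are exhaustive, so the affected, intermediate, and normal probabilities should sum to one.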

Parental Probabilities

The assumption of panmixia allows us to neglect inbreeding and to treat parents as independently distributed. The midparental breeding value $b$ is the mean of the breeding values of the parents, $c_{fa}$ and $c_{mo}$,

$$b = \frac{c_{fa} + c_{mo}}{2}. \tag{11}$$

Transformation of variables gives us the distribution of $b$ from the distribution of the parents' breeding values, $P(b) = 2\int P_{fa}(h)\,P_{mo}(2b - h)\,dh$, where the coefficient 2 comes from the Jacobian of the transformation. Therefore, assuming that equation (1) holds for parents as well as children, the probability of a pair of genotypes $g_j \times g_k$ and a midparental breeding value $b$ conditional on the parental phenotypes is

$$\psi_{jkb} = \frac{2\,p_j p_k \displaystyle\int_{-\infty}^{\infty} f\!\left(\frac{h}{\sqrt{C}}\right) f\!\left(\frac{2b - h}{\sqrt{C}}\right) P(\phi_{fa} \mid g_j, h)\, P(\phi_{mo} \mid g_k, 2b - h)\, dh}{P(\phi_{fa})\, P(\phi_{mo})}. \tag{12}$$

By using the fact that

$$\sum_{j} \sum_{k} \int \psi_{jkb}\, db = 1,$$

we replace evaluation of the denominator by norming.

The evaluation of the integral in formula (12) depends upon the kind of observation made on the parental phenotypes. If the parents have either quantitative values or are unobserved, the probabilities are all normal and therefore their convolution is also normal. In this case $\psi_{jkb}$ can be written analytically just by calculating the mean and variance, as follows. Consider a single parent with phenotype measured $x$. Then the probability that he has genotype $g$ and polygenic breeding value $c$ is

$$P(g, c \mid x) = P(g \mid x)\, P(c \mid g, x). \tag{13}$$

For the first term, $P(g \mid x) = P(x \mid g)\,p_g / P(x)$. The denominator is accounted for by norming at the end of the procedure. The second term of formula (13) is written $P(c \mid x, g) = P(c)\,P(x \mid c, g)/P(x \mid g)$. Since these three are all normally distributed, we have

$$P(c \mid x, g) \propto \exp\left\{-\frac{1}{2}\left[\frac{c^2}{C} + \frac{[x - (g + c)]^2}{E} - \frac{(x - g)^2}{C + E}\right]\right\} = \exp\left\{-\frac{C + E}{2CE}\left[c - \frac{C}{C + E}(x - g)\right]^2\right\}.$$

That is, $c$ conditional on $x$ and $g$ is normally distributed with $E(c \mid g, x) = [C/(C + E)](x - g)$ and $\operatorname{var}(c \mid g, x) = 1/(1/C + 1/E)$. The same type of argument applies to the other parent if also measured for the quantitative trait. Then from equation (11), under panmixia, $b$ is distributed normally with

$$E(b \mid g_j, g_k, x_{fa}, x_{mo}) = \frac{C}{C + E}\left[\frac{(x_{fa} + x_{mo}) - (g_j + g_k)}{2}\right], \qquad \operatorname{var}(b \mid g_j, g_k, x_{fa}, x_{mo}) = \frac{1}{2(1/C + 1/E)}. \tag{14}$$
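The posterior moments in equation (14) follow from the single-parent result by averaging two independent normal variables. A minimal sketch, with hypothetical function names:

```python
def breeding_value_posterior(x, g, C, E):
    """Mean and variance of c given the trait value x and major-locus effect g."""
    mean = C / (C + E) * (x - g)
    var = 1.0 / (1.0 / C + 1.0 / E)
    return mean, var


def midparent_posterior(x_fa, x_mo, g_j, g_k, C, E):
    """Equation (14): moments of b = (c_fa + c_mo) / 2, both parents measured."""
    mean_fa, var_fa = breeding_value_posterior(x_fa, g_j, C, E)
    mean_mo, var_mo = breeding_value_posterior(x_mo, g_k, C, E)
    # Independent parents under panmixia: means average, variances add and quarter.
    return (mean_fa + mean_mo) / 2, (var_fa + var_mo) / 4
```

With $C = E = 1$, a father at $x = 2$ with $g_j = 0$, and a mother at $x = 1$ with $g_k = 1$, this gives a mean of 0.5 and a variance of 0.25, in agreement with the closed forms of equation (14).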

The probability of a breeding value $c$ with $\phi$ not observed is the prior $f(c/\sqrt{C})$ independent of the major locus, so that the distribution of $b$ with one parent unknown and the other measured for $x$ is normal with

$$E(b \mid g_j, g_k, x_{fa}, \phi_{mo} = ?) = \frac{C}{2(C + E)}(x_{fa} - g_j), \qquad \operatorname{var}(b \mid g_j, g_k, x_{fa}, \phi_{mo} = ?) = \frac{C}{2}\left(\frac{C/2 + E}{C + E}\right). \tag{15}$$

The distribution of $b$ when both parents are unknown is just the prior $f(b/\sqrt{C/2})$. If $\phi$ = affected (aff.), then

$$P(g, c \mid \text{aff.}) = \frac{P(g)\,P(c)\,P(\text{aff.} \mid g, c)}{P(\text{aff.})} = \frac{p_g\, f(c/\sqrt{C})\, Q\{[(Z - g) - c]/\sqrt{E + W}\}}{A}. \tag{16}$$

Since this apparently has no closed form, it must be treated numerically. But since all affected people in the sample have this same posterior distribution, the laborious calculation need be carried out only once for a set of parameters. The probabilities for the intermediate and nonaffected conditions are similar to that shown in equation (16).
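Equation (16) can be tabulated once on a grid of breeding values and reused for every affected parent. The sketch below uses a plain midpoint rule with an assumed step of 0.05 SD over ±8 SD; these settings and the function name `affected_posterior` are illustrative choices, not the authors' numerical scheme.

```python
import math


def upper_tail(a):
    """Q(a): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def normal_density(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def affected_posterior(q, t, d, z, C, E, W, Z, step=0.05, span=8.0):
    """Tabulate P(g_i, c | aff.) of equation (16) on a midpoint grid in c.

    Returns (c_grid, densities, A, width); densities[i][m] is the joint posterior
    density of genotype index i + 1 and breeding value c_grid[m].
    """
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]
    effects = [z + t, z + t * d, z]
    A = sum(p * upper_tail((Z - g) / math.sqrt(C + E + W))
            for p, g in zip(freqs, effects))            # equation (9)
    sd_c = math.sqrt(C)
    n = int(round(2 * span / step))
    c_grid = [(-span + (m + 0.5) * step) * sd_c for m in range(n)]
    densities = [
        [p * normal_density(c, 0.0, C) * upper_tail((Z - g - c) / math.sqrt(E + W)) / A
         for c in c_grid]
        for p, g in zip(freqs, effects)
    ]
    return c_grid, densities, A, step * sd_c
```

Summing the tabulated densities over genotypes and grid points, multiplied by the grid width, should return 1, because integrating the numerator of equation (16) over $c$ and summing over $g$ reproduces the incidence $A$ of equation (9).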

Common Sibling Environment

Since the common sibling environmental effect is assumed independent of parental genetic factors, the probability distribution of all family parameters is the product of $\psi_{jkb}$ and the prior probability of $e_s$.

For simplicity, we may take the two continuous variables $b$ and $e_s$ together, $v = b + e_s$, and use the convolution $\psi_{jkv} = \int \psi_{jkb}\, P(e_s = v - b)\, db$. This is easy when $\psi$ is normal because of the convolution law mentioned above. Therefore, when both parents have $x$ values given, equation (14) yields

$$E(v \mid g_j, g_k, x_{fa}, x_{mo}) = \frac{C}{C + E}\left[\frac{(x_{fa} + x_{mo}) - (g_j + g_k)}{2}\right], \qquad \operatorname{var}(v \mid g_j, g_k, x_{fa}, x_{mo}) = \frac{1}{2(1/C + 1/E)} + S.$$

When one parent is unknown, equation (15) yields

$$E(v \mid g_j, g_k, x_{fa}, \phi_{mo} = ?) = \frac{C}{2(C + E)}(x_{fa} - g_j), \qquad \operatorname{var}(v \mid g_j, g_k, x_{fa}, \phi_{mo} = ?) = \frac{C}{2}\left(\frac{C/2 + E}{C + E}\right) + S.$$

When both parents are unknown, $P(v) = f(v/\sqrt{C/2 + S})$. The cases involving affection probabilities must be treated numerically, but only once for all sibships.
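The three normal cases for $v$ differ only in how much each parent contributes. A minimal sketch follows, in which an unobserved parent is represented by `None` (an illustrative convention) and `common_value_moments` is a hypothetical name:

```python
def common_value_moments(x_fa, x_mo, g_j, g_k, C, E, S):
    """Mean and variance of v = b + e_s given parental trait values.

    Pass None for a parent whose phenotype is not observed.
    """
    shrink = C / (C + E)              # regression of c on (x - g)
    posterior_var = C * E / (C + E)   # var(c | x, g) = 1 / (1/C + 1/E)
    mean_sum = 0.0
    var_sum = 0.0
    for x, g in ((x_fa, g_j), (x_mo, g_k)):
        if x is None:
            var_sum += C              # prior on c: mean zero, variance C
        else:
            mean_sum += shrink * (x - g)
            var_sum += posterior_var
    # b is the average of the two breeding values; e_s adds variance S.
    return mean_sum / 2, var_sum / 4 + S
```

With both parents unobserved the function returns mean 0 and variance $C/2 + S$, matching the prior for $v$ stated in the text.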

Conditional Likelihood

The second step in the calculation of sibship likelihood is to evaluate the probability of the children's phenotypes conditional upon each possible set of family parameters $g_j$, $g_k$, and $v$. The calculation is facilitated by the fact that, given these parameters, the sib values are independently distributed and may therefore be treated separately.

Let $P_{ijk}$ be the probability that a child have major locus index $i$ when the parental mating type is $g_j \times g_k$ (table 3).

TABLE 3

PROBABILITY ($P_{ijk}$) THAT CHILD HAS MAJOR LOCUS INDEX $i$ WHEN PARENTAL MATING TYPE IS $g_j \times g_k$

$i = 1$    $i = 2$    $i = 3$    $j$    $k$
1          0          0          1      1
1/2        1/2        0          1      2
0          1          0          1      3
1/2        1/2        0          2      1
1/4        1/2        1/4        2      2
0          1/2        1/2        2      3
0          1          0          3      1
0          1/2        1/2        3      2
0          0          1          3      3
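Table 3 is ordinary Mendelian transmission at a diallelic locus, so it can be generated rather than typed in. The sketch below assumes only that a parent with index 1, 2, or 3 (ΓΓ, Γγ, γγ) transmits Γ with probability 1, 1/2, or 0; the function names are hypothetical.

```python
from fractions import Fraction


def gamma_transmission(index):
    """Probability that a parent with genotype index 1, 2, or 3 transmits Γ."""
    return {1: Fraction(1), 2: Fraction(1, 2), 3: Fraction(0)}[index]


def child_genotype_probs(j, k):
    """Row of table 3: (P_1jk, P_2jk, P_3jk) for mating type g_j x g_k."""
    a, b = gamma_transmission(j), gamma_transmission(k)
    return (a * b, a * (1 - b) + (1 - a) * b, (1 - a) * (1 - b))


# Rebuild the whole table, keyed by (j, k).
TABLE_3 = {(j, k): child_genotype_probs(j, k) for j in (1, 2, 3) for k in (1, 2, 3)}
```

Using exact fractions keeps the rows free of rounding error, which is convenient when the table is later multiplied into many likelihood terms.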

The probability that the $n$th member of a sibship have phenotype $\phi_n$ given the values $g_j$, $g_k$, and $v$ is

$$\sum_{i=1}^{3} P_{ijk}\, P(\phi_n \mid g_i, v).$$

Since the sibs are independent under $g_j$, $g_k$, and $v$, the conditional probability of the entire set of $s$ sibs is just the product

$$\prod_{n=1}^{s} \sum_{i=1}^{3} P_{ijk}\, P(\phi_n \mid g_i, v).$$

Finally, the likelihood of the sibship is the total of the conditional probabilities weighted by the parental probabilities,

$$L = \int_{-\infty}^{\infty} \sum_{j=1}^{3} \sum_{k=1}^{3} \psi_{jkv} \prod_{n=1}^{s} \sum_{i=1}^{3} P_{ijk}\, P(\phi_n \mid g_i, v)\, dv. \tag{17}$$
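To make the structure of equation (17) concrete, the sketch below evaluates it for the simplest case: both parents unobserved and every child measured for $x$. Under those assumptions $\psi_{jkv} = p_j p_k\, f(v/\sqrt{C/2 + S})$. The integral over $v$ is done with a plain midpoint rule (step 0.05 SD over ±8 SD), which is an illustrative choice and not the polychotomization used by the authors; the function name is hypothetical.

```python
import math

GAMMA_TRANSMISSION = {1: 1.0, 2: 0.5, 3: 0.0}   # P(parent transmits Γ) by index


def normal_density(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def sibship_likelihood(xs, q, t, d, z, C, S, R, step=0.05, span=8.0):
    """Equation (17) for unobserved parents and sibs with trait values xs."""
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]
    effects = [z + t, z + t * d, z]
    var_v = C / 2 + S            # prior variance of v = b + e_s
    var_child = C / 2 + R        # var(x | g, v)
    width = step * math.sqrt(var_v)
    n = int(round(2 * span / step))
    total = 0.0
    for m in range(n):
        v = (-span + (m + 0.5) * step) * math.sqrt(var_v)
        prior_v = normal_density(v, 0.0, var_v)
        for j in (1, 2, 3):
            for k in (1, 2, 3):
                a, b = GAMMA_TRANSMISSION[j], GAMMA_TRANSMISSION[k]
                child = [a * b, a * (1 - b) + (1 - a) * b, (1 - a) * (1 - b)]
                sib_product = 1.0
                for x in xs:
                    sib_product *= sum(P * normal_density(x, g + v, var_child)
                                       for P, g in zip(child, effects))
                total += freqs[j - 1] * freqs[k - 1] * prior_v * sib_product * width
    return total
```

For a single child the result should reduce to the marginal mixture density of equation (6), since the parental genotype frequencies average the transmission probabilities back to $p_i$ and the variances $C/2 + S$ and $C/2 + R$ add to $C + E$.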

If there is no major locus effect, equation (17) can be calculated analytically, but in general it must be treated numerically. For these calculations continuous variables are represented in polychotomized form, and integration is approximated by summation [9]. Although of little theoretical interest, numerical techniques affect the entire feasibility of the procedure. We use standardized normal deviates $z$, partitioned into $n$ intervals. For convenience the $(n - 2)$ interior intervals are of equal length and are represented by their midpoints. The time required by calculations depends heavily upon the number of intervals of polychotomization. A large numerical experiment was performed to determine the minimum value $n$ which yielded accurate computations. Based upon the results, the size of the interior intervals was set at 0.25 SD. The two exterior intervals are bounded by $\pm 4$ SD and are represented by $\pm f(4)/Q(4)$, respectively. Numerical tests of 5, 6, and 10 SD showed almost no discernible increase in accuracy.
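The polychotomized grid described above can be sketched as follows. The representative points follow the text (interval midpoints inside ±4 SD, and $\pm f(4)/Q(4)$ for the two tails); weighting each point by the normal probability mass of its interval is an assumption, since the text does not state the weights. The function name is hypothetical.

```python
import math


def normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def upper_tail(a):
    """Q(a): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def polychotomize(step=0.25, bound=4.0):
    """Return (points, weights) representing a standard normal deviate."""
    n_interior = int(round(2 * bound / step))
    tail_mass = upper_tail(bound)
    tail_point = normal_pdf(bound) / tail_mass      # mean of z given z > bound
    points = [-tail_point]
    weights = [tail_mass]
    for m in range(n_interior):
        lower = -bound + m * step
        points.append(lower + step / 2)             # interior midpoint
        weights.append(upper_tail(lower) - upper_tail(lower + step))
    points.append(tail_point)
    weights.append(tail_mass)
    return points, weights


points, weights = polychotomize()
# An integral of h(z) against the normal density becomes a weighted sum:
second_moment = sum(w * z * z for z, w in zip(points, weights))
```

With the default settings the grid has 34 points (32 interior intervals plus two tails), and the weights sum to one by construction.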

STATISTICAL TECHNIQUES

It is by maximizing the likelihood equation (17) that we estimate the values of the parameters of the mixed genetic model, and, more important, it is by the ratio of likelihoods under various restrictions that we test the applicability of its elements to data. We perform maximum-likelihood estimation by Newton-Raphson iteration upon the following: $V$, variance of $x$; $u$, mean of $x$; $d$, degree of dominance at major locus; $q$, gene frequency at major locus; $t$, displacement at major locus; $H$, polygenic heritability ($C/V$); and $B$, relative variance due to common environment ($S/V$). While maximizing likelihood for seven parameters simultaneously would seldom be useful, there are many relations inherent in the mixed genetic model which can be used to partition the problem so that calculations can be performed with accuracy and within reasonable time. Maximum-likelihood scores are calculated only for parameters to be iterated. We assume that the population affection rates $A$ and $I$ are known from other evidence, and that $W$, $Z$, and $M$ have been estimated separately as in equations (20) or (21) below.

More important than estimating parameters under a given model is validation of the model itself. The mixed model provides tests of separate hypotheses by the standard likelihood-ratio method. If $\sum \ln L_0$ denotes the maximal logarithmic likelihood of a sample under a null hypothesis which specifies $p$ parameters, and $\sum \ln L$ is the unconditional maximum with these parameters estimated, then in large samples the quantity $2(\sum \ln L - \sum \ln L_0)$ tends to the $\chi^2$ distribution with $p$ df [10]. This provides a means to test a prior subhypothesis, such as $H = 0$. Moreover, since the model specifies the entire distribution of this trait, within and among families, heterogeneity tests can be applied to the unconditional model itself. For example, if there were not one major locus but several, acting independently, there would tend to be heterogeneity in the estimate $q$ between families with different values for the parents [11]. The sample could be partitioned into families in which both parents have $x$ values above the mean, families with one parent above and one below the mean, and families with both below. The test of $q_1 = q_2 = q_3$ would have 2 df.

Simulation studies [12] have shown that quantitative data treated by this approach yield much greater information than does affection status. Simulated tests for the presence of a megaphenic locus, significant polygenic effect, and so on, were powerful and indicate that this procedure should be practical when applied to real data.
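The likelihood-ratio test described above needs only the two maximized log likelihoods and the number of restricted parameters. The sketch below computes the statistic and its large-sample significance level using the closed-form $\chi^2$ tail for integer degrees of freedom; the function names are hypothetical and the log-likelihood values in the usage lines are made up for illustration.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with integer df exceeds x), from the closed-form series."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))          # 2 * Q(sqrt(x))
    term, series = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        series += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return tail + 2 * density * series


def likelihood_ratio_test(log_lik_full, log_lik_null, p):
    """Return (statistic, significance level) for p restricted parameters."""
    statistic = 2 * (log_lik_full - log_lik_null)
    return statistic, chi2_upper_tail(statistic, p)


# Hypothetical summed log likelihoods for a test of H = 0 (one parameter).
stat, level = likelihood_ratio_test(-1234.5, -1237.9, p=1)
```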


Sample Selection

In the simplest case, called complete selection, the individuals may be randomly drawn from the population, chosen on the basis of particular $x$ values and subsequently examined for affection, or they may be the children or parents (but not sibs) of probands [13]. The likelihood equation (17), since it is a function conditional upon the parental phenotypes, is invariant with respect to parental values. Therefore, if the sample is assembled strictly on parental values, the population parameters may be estimated regardless of the criterion for sample selection. The situation is slightly more complicated under sampling of index sibships containing one or more probands. In this case, called incomplete selection, the likelihood must be adjusted for the ascertainment probability of each sibship. If ascertainment is a function of phenotype, let $\pi(\phi_n)$ be the probability that $n$ is a proband. The probability that a sibship of size $s$ with phenotypes $\phi_1, \ldots, \phi_s$ be ascertained is

$$1 - \prod_{n=1}^{s} [1 - \pi(\phi_n)].$$

Therefore the likelihood equation (17), allowing for ascertainment, yields

$$L = \frac{\left\{1 - \prod_{n=1}^{s} [1 - \pi(\phi_n)]\right\} \displaystyle\int \sum_{j} \sum_{k} \psi_{jkv} \prod_{n=1}^{s} \left[\sum_{i} P_{ijk}\, P(\phi_n \mid g_i, v)\right] dv}{1 - \displaystyle\int \sum_{j} \sum_{k} \psi_{jkv} \left\{\int [1 - \pi(\phi)] \sum_{i} P_{ijk}\, P(\phi \mid g_i, v)\, d\phi\right\}^{s} dv}. \tag{18}$$

For the usual case where the sibship is selected through affection with ascertainment probability $\pi$, equation (18) becomes

$$L = \frac{[1 - (1 - \pi)^r] \displaystyle\int \sum_{j} \sum_{k} \psi_{jkv} \prod_{n=1}^{s} \left[\sum_{i} P_{ijk}\, P(\phi_n \mid g_i, v)\right] dv}{1 - \displaystyle\int \sum_{j} \sum_{k} \psi_{jkv} \left\{1 - \pi \sum_{i} P_{ijk}\, Q\!\left[(Z - g_i - v)/\sqrt{C/2 + R + W}\right]\right\}^{s} dv},$$

where $r$ is the number of affected members of the sibship [14].
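The two ascertainment factors in the affection case are linked by a binomial identity, which the sketch below checks numerically. It assumes that, for fixed $j$, $k$, and $v$, each child is independently affected with some probability `affection_prob` (standing for $\sum_i P_{ijk} Q[\cdot]$); the function names are hypothetical.

```python
import math


def proband_factor(pi, r):
    """Numerator factor 1 - (1 - pi)^r for a sibship with r affected members."""
    return 1.0 - (1.0 - pi) ** r


def ascertainment_probability(pi, affection_prob, s):
    """1 - (1 - pi * a)^s: chance that a sibship of size s is ascertained."""
    return 1.0 - (1.0 - pi * affection_prob) ** s


def averaged_proband_factor(pi, affection_prob, s):
    """Average of proband_factor over the binomial distribution of r."""
    return sum(
        math.comb(s, r) * affection_prob ** r * (1 - affection_prob) ** (s - r)
        * proband_factor(pi, r)
        for r in range(s + 1)
    )
```

Averaging the numerator factor over the number of affected sibs reproduces the term inside the denominator, so the corrected likelihood sums to one over all ascertainable sibships.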

Separate Inference from the Subsample with Quantitative Trait Observed

Information about the mean and variance of the quantitative trait $x$ comes from the subsample with $x$ measured, since affection status contains no information about these parameters. When there is no quantitative information, we set $u = 0$ and $V = 1$. If there is no ascertainment bias, the mean and variance of the sample provide good estimates of $u$ and $V$. Under incomplete selection the general likelihood equations provide maximum-likelihood estimates, but this is a poor design for determination of these population parameters. Usually in such a case, $u$ and $V$ would be known from other studies, or a control sample of normal families would be used in addition to the sample under incomplete selection.

Since the model specifies a purely additive polygenic effect, the dominance deviation should theoretically yield information about the major locus. Under the mixed model, the effect of substituting gene Γ for gene γ is $a_1 = g_1 q + g_2(1 - q) - u = t(1 - q)[q + (1 - 2q)d]$, and the substitution of γ for Γ gives $a_2 = g_3(1 - q) + g_2 q - u = -tq[q + (1 - 2q)d]$. The breeding values of the three genotypes are $bv_1 = 2a_1 = 2(1 - q)t[q + (1 - 2q)d]$; $bv_2 = a_1 + a_2 = (1 - 2q)t[q + (1 - 2q)d]$; and $bv_3 = 2a_2 = -2qt[q + (1 - 2q)d]$. Thus the additive variance of the major locus is $V_{GA} = \sum p_i\, bv_i^2 = 2q(1 - q)t^2[q + (1 - 2q)d]^2$, and the dominance variance is

$$V_D = \sum p_i (g_i - bv_i - u)^2 = [q(1 - q)t(1 - 2d)]^2. \tag{19}$$

The covariances between sibs and between parents and offspring yield an estimate of $V_D$ [15], which in turn could be used to determine a relation among the parameters of the major locus by equation (19). In practice, however, $V_D$ yields a poor estimate of any of the three major locus parameters. Usually $V_D$ is small, and since its estimate is the difference of two estimated covariances, its sampling variance is large. Moreover, estimation of $V_D$ from covariances is confounded with environment common to sibs but not to parent-offspring pairs. In practice the relative variance components are best estimated by path analysis [2], yielding trial values of parameters for segregation analysis.
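The additive and dominance components of the major locus can be computed from the definitions and compared with the closed forms, including equation (19). This is a minimal sketch with a hypothetical function name; $z$ is set to zero, which does not affect any variance.

```python
def major_locus_variance_components(q, t, d):
    """Return (V_GA, V_D) computed from the breeding values of the three genotypes."""
    common = q + (1 - 2 * q) * d
    a1 = t * (1 - q) * common                 # effect of substituting Γ for γ
    a2 = -t * q * common                      # effect of substituting γ for Γ
    breeding = [2 * a1, a1 + a2, 2 * a2]      # bv_1, bv_2, bv_3
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]
    effects = [t, t * d, 0.0]                 # g_i with z = 0
    u = sum(p * g for p, g in zip(freqs, effects))
    V_GA = sum(p * bv * bv for p, bv in zip(freqs, breeding))
    V_D = sum(p * (g - bv - u) ** 2 for p, g, bv in zip(freqs, effects, breeding))
    return V_GA, V_D


V_GA, V_D = major_locus_variance_components(q=0.3, t=2.0, d=0.2)
```

The two components should add up to the major-locus variance $G$ defined earlier, which provides a check on both the code and the algebra.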

Estimation of the Liability Threshold

Equation (7) yields several restrictions to which a study must conform if this model is to be valid. Most important, the relationship of $w$ to $x$ must be linear. This should be tested by preliminary probit analysis of the data. Second, $w$ conditional upon $x$ varies only with $k$; that is, it is independent of genetic factors. Thus it is inherent in this model that knowledge of $x$ is more informative than data on relatives. This aspect of the model is designed for the situation in which a quantitative measure is a relatively good indicator of liability but may be available only for a portion of the pedigree.

Since it is assumed that this relationship supersedes the relationship of liability to genetic factors, we can often use the conditional probability of affection to estimate the liability noise $W$ and thresholds $Z$ and $M$ from individuals with both $x$ and affection known. This is possible only if the sample measured for both $x$ and affection status is large enough in each affection category. In the case where no $x$ values are measured, $W$ takes the value zero and $Z$ and $M$ are derived from equations (9) and (10). Even when $x$ values are given, if the incidences $A$ and $I$ are accurately known the investigator may require that the segregation analysis conform to the restrictions of equations (9) and (10). Since these include the parameters of major locus and polygenic effect, in this case the affection parameters cannot be estimated separately. The thresholds $Z$ and $M$ must be solved for equations (9) and (10) under each set of trial values of all the parameters with $W$ iterated simultaneously. However, the convenience of separate estimation of $Z$, $M$, and $W$ may be more important than exact conformity to affection rates.
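Solving equation (9) for $Z$ given a known incidence $A$ is a one-dimensional root-finding problem, because the incidence decreases monotonically as the threshold rises. The sketch below uses bisection, which is an illustrative choice rather than the authors' method; the function names and the search bracket of ±50 are assumptions.

```python
import math


def upper_tail(a):
    """Q(a): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def incidence(Z, q, t, d, z, C, E, W):
    """Equation (9): population incidence of affection for threshold Z."""
    freqs = [q * q, 2 * q * (1 - q), (1 - q) ** 2]
    effects = [z + t, z + t * d, z]
    sd = math.sqrt(C + E + W)
    return sum(p * upper_tail((Z - g) / sd) for p, g in zip(freqs, effects))


def solve_threshold(A, q, t, d, z, C, E, W, lo=-50.0, hi=50.0, tol=1e-10):
    """Find Z such that incidence(Z) = A by bisection on a decreasing function."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if incidence(mid, q, t, d, z, C, E, W) > A:
            lo = mid          # incidence too high, so the threshold must rise
        else:
            hi = mid
    return (lo + hi) / 2
```

The same routine applies to the second threshold $M$ by solving for the incidence $A + I$ in place of $A$, as equation (10) implies.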

For separate estimation in the case of complete ascertainment, assuming no intermediate status, the likelihood of the affection status of an index sibship containing $s$ sibs with $r$ affected is

$$L = \prod_{i=1}^{r} Q\!\left[(Z - x_i)/\sqrt{W}\right] \prod_{j=1}^{s-r} \left\{1 - Q\!\left[(Z - x_j)/\sqrt{W}\right]\right\}, \tag{20}$$

where $i$ denotes an affected and $j$ a normal individual. Under incomplete selection the likelihood becomes

$$L = \frac{[1 - (1 - \pi)^r] \prod_{i=1}^{r} Q\!\left[(Z - x_i)/\sqrt{W}\right] \prod_{j=1}^{s-r} \left\{1 - Q\!\left[(Z - x_j)/\sqrt{W}\right]\right\}}{1 - \prod_{m=1}^{s} \left\{1 - \pi\, Q\!\left[(Z - x_m)/\sqrt{W}\right]\right\}}. \tag{21}$$

Differentiation gives maximum-likelihood scores of $Z$ and $W$, which are estimated by simultaneous Newton-Raphson iteration. If this sample is large enough to give reliable estimates of $Z$ and $W$, they may be taken as constant in the rest of the analysis. Similar equations apply to $M$, $Z$, and $W$ for a trichotomy. Such techniques of partitioning the inference problem allow estimation and testing of the few parameters germane to a scientific inquiry while the others are held in the background.
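Equations (20) and (21) are easiest to work with on the log scale. The sketch below evaluates both log likelihoods for one sibship; it assumes $W > 0$ and at least one affected sib in the incomplete-selection case, and the function names are hypothetical. The Newton-Raphson iteration itself is not shown.

```python
import math


def upper_tail(a):
    """Q(a): standard normal probability above a."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def log_likelihood_complete(affected_x, normal_x, Z, W):
    """Log of equation (20) for trait values of affected and normal sibs."""
    sd = math.sqrt(W)
    total = sum(math.log(upper_tail((Z - x) / sd)) for x in affected_x)
    total += sum(math.log(1.0 - upper_tail((Z - x) / sd)) for x in normal_x)
    return total


def log_likelihood_incomplete(affected_x, normal_x, Z, W, pi):
    """Log of equation (21): equation (20) corrected for ascertainment."""
    sd = math.sqrt(W)
    r = len(affected_x)
    not_ascertained = 1.0
    for x in list(affected_x) + list(normal_x):
        not_ascertained *= 1.0 - pi * upper_tail((Z - x) / sd)
    return (log_likelihood_complete(affected_x, normal_x, Z, W)
            + math.log(1.0 - (1.0 - pi) ** r)
            - math.log(1.0 - not_ascertained))
```

Summing either function over sibships gives the sample log likelihood, which can then be maximized over $Z$ and $W$ by any standard optimizer.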

DISCUSSION

This paper was motivated by the consideration that quantitative traits are moresusceptible to covariance adjustment for age, sex, and environmental effects, andoften more informative in genetic counseling, than are attributes which tend to bedefined less objectively. When an attribute is defined by truncation of a con-tinuous trait or is correlated with such a trait, neglect of quantitative informationmust lead to appreciable loss of power to discriminate genetic hypotheses. In theabsence of clear bimodality (as in PTC testing) or of qualitative differences at themajor locus (as for red cell acid phosphatase), any claim of a major locus basedon analysis which neglects polygenic variation and common environment shouldnot be trusted. Definitive analysis requires both path and segregation approaches.By using data on adopted children, half-sibs, twins, and other relationships, pathanalysis can recognize genetic assortative mating, gene-environment correlation,and environment common to parents and sibs. Segregation analysis of nuclearfamilies reared together is necessarily more limited. We have therefore assumedrandom mating, no gene-environment correlation, and no environment common toparents and children, although environment common to sibs is allowed. These as-sumptions may overestimate polygenic heritability but have no appreciable effecton the reliability or power for discriminating a major locus, which is the primaryobjective of segregation analysis [12]. The same simplifications seem appropriatefor recurrence risks in genetic counseling, where the distinction between commonenvironment and polygenic heritability is also of little consequence.

We have used a computer program incorporating the methods of this paper in two ways: (1) Monte Carlo simulation of various cases, some of which violate our assumptions, to determine power and robustness [12]; and (2) applications to real data from Brazil [16] to test for major loci acting on race, size, obesity, hematocrit, and blood pressure. These studies suggest that, with careful attention to internal tests for validation of the underlying assumptions, our methods can be useful in detecting or excluding major loci for quantitative traits.

SUMMARY

A mixed model of major locus, polygenic variation, and both common and random environment is developed for testing genetic hypotheses about a quantitative trait measured in a sample of nuclear families. The major-locus dominance, gene frequency, and effect are all estimated from the data, as is the polygenic heritability. In order to treat pedigrees in which not all members are measured for the quantitative trait, a threshold model is incorporated to relate affection status to the genetic factors. Altogether the model contains more than half a dozen parameters which can be partitioned by use of sample variances and covariances, separate estimation of thresholds, and other techniques so that the number simultaneously estimated or tested is within the capability of contemporary computers. The complementary relation of path and segregation analysis is discussed.

REFERENCES

1. MORTON NE: Analysis of family resemblance. I. Introduction. Am J Hum Genet 26:318-330, 1974

2. RAO DC, MORTON NE, YEE S: Analysis of family resemblance. II. A linear model for familial correlation. Am J Hum Genet 26:331-359, 1974

3. WRIGHT S: Correlation and causation. J Agric Res 20:557-585, 1921

4. FISHER RA: On the probable error of a coefficient of correlation deduced from a small sample. Metron 1:1-32, 1921

5. SMITH C: Discrimination between different modes of inheritance in genetic disease. Clin Genet 2:303-314, 1971

6. ELSTON RC, STEWART J: A general model for the genetic analysis of pedigree data. Hum Hered 21:523-542, 1971

7. MORTON NE: The detection of major genes under additive continuous variation. Am J Hum Genet 19:23-34, 1967

8. FALCONER DS: The inheritance of liability to certain diseases, estimated from the incidence among relatives. Ann Hum Genet 29:51-76, 1965

9. HOUSEHOLDER AS: Principles of Numerical Analysis. New York, McGraw-Hill, 1953

10. WALD A: Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans Am Math Soc 54:426-488, 1943

11. CHUNG CS, ROBISON OW, MORTON NE: A note on deaf mutism. Ann Hum Genet 23:357-366, 1959

12. MACLEAN CJ, MORTON NE: Analysis of family resemblance. IV. Operational characteristics of segregation analysis. In preparation

13. MORTON NE: Genetic tests under incomplete ascertainment. Am J Hum Genet 11:1-6, 1959

14. MORTON NE, YEE S, LEW R: Complex segregation analysis. Am J Hum Genet 23:602-611, 1971

15. FALCONER DS: Introduction to Quantitative Genetics. Edinburgh, Oliver and Boyd, 1960

16. MACLEAN CJ, RAO DC, MORTON NE: Analysis of family resemblance. V. Quantitative traits in northeastern Brazil. In preparation

New Genetic Nomenclature for Human Blood Coagulation

The International Committee on Thrombosis and Haemostasis has recently adopted a new system of nomenclature for blood coagulation. Because of the many recently discovered variants of hereditary blood-coagulation factors and precedents established in the nomenclature of hemoglobin and glucose-6-phosphate dehydrogenase, a new system was established to assure uniformity, clarity, and simplicity. The full report has been published (Thromb Diath Haemorrh 30:2-11, 1973) and will also appear in the Bulletin of the World Health Organization; an excerpt was published in Lancet (2:891-892, 1973).

Papers submitted to the American Journal of Human Genetics should conform to the new system.
