

Copyright © 1997 by the Genetics Society of America

Comparing Patterns of Nucleotide Substitution Rates Among Chloroplast Loci Using the Relative Ratio Test

Spencer V. Muse* and Brandon S. Gaut†

*Institute of Molecular Evolutionary Genetics, Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802 and †Department of Plant Sciences and Center for Theoretical and Applied Genetics, Rutgers University, New Brunswick, New Jersey 08903

Manuscript received October 8, 1996
Accepted for publication January 21, 1997

ABSTRACT

Even when several genetic loci are used in molecular evolutionary studies, each locus is typically analyzed independently of the others. This type of approach makes it difficult to study mechanisms and processes that affect multiple genes. In this work we develop a statistical approach for the joint analysis of two or more loci. The tests we propose examine whether or not nucleotide substitution rates across evolutionary lineages have the same relative proportions at two loci. These procedures are applied to 33 genes from the chloroplast genomes of rice, tobacco, pine, and liverwort. With the exception of five clearly distinct loci, we find that synonymous substitution rates tend to change proportionally across genes. We interpret these results to be consistent with a “lineage effect” acting on the entire chloroplast genome. In contrast, nonsynonymous rates do not change proportionally across genes, suggesting that locus-specific evolutionary effects dominate patterns of nonsynonymous substitution.

THE molecular clock hypothesis, which posits that molecular evolutionary rates remain constant over time, was introduced over three decades ago (MARGOLIASH 1963; ZUCKERKANDL and PAULING 1965). Years of often heated debate have now established that the molecular clock is not a universal phenomenon at the DNA level. Variation in nucleotide substitution rates has been found in animals (e.g., MARTIN et al. 1992; EASTEAL et al. 1995; LI et al. 1996), plants (e.g., WOLFE et al. 1987; GAUT et al. 1992; MUSE and GAUT 1994; NICKRENT and STARR 1994), and bacteria (e.g., MORAN 1996). Although a universal, time-calibrated molecular clock does not exist, there may be molecular clocks that are calibrated by factors other than time. For example, if a change in generation time produces a systematic increase in the number of germ cell divisions per unit time, a corresponding increase in nucleotide substitution rate will result. These types of effects are predicted by the neutral theory of molecular evolution (KIMURA 1983), and their existence is supported by diverse empirical evidence. Aside from generation time, authors have suggested a number of other mechanisms that might affect molecular clocks, including metabolic rate (MARTIN and PALUMBI 1993), population size (MORAN 1996), polymerase fidelity (WU and LI 1985; BRITTEN 1986), and speciation rate (BOUSQUET et al. 1992).

All of the above factors are similar in that they produce “lineage effects”: the resulting changes in substitution rates should be observed at all loci in a particular evolutionary lineage. For example, a doubling of generation time should produce a halving of substitution rate at all loci. [This oversimplification is suitable for the present purposes, but see OHTA (1992, 1993).] Lineage effects are to be contrasted with “locus effects,” which are specific to individual genetic loci. It is important to distinguish between two types of locus effects. The simplest are those that merely govern the rate of evolution relative to other loci. As an example, BULMER et al. (1991) speculated that regional differences in mutation rates were responsible for observed heterogeneity in synonymous substitution rates among loci. Thus, the first type of locus effect is responsible for interlocus rate differences that remain fairly constant across evolutionary lineages. The second class of locus effects is composed of factors that modify the substitution rate at a single locus in a particular lineage or set of lineages. Occasional intervals of locus-specific positive or negative selection can create this type of effect. These effects might also be thought of as locus × lineage interactions. It is the latter form of locus effects that will be discussed in this work, and unless stated otherwise the term locus effect will imply the second type of effect.

Corresponding author: Spencer V. Muse, 206 Tucker Hall, Division of Biological Sciences, University of Missouri, Columbia, MO 65211. E-mail: [email protected]

Genetics 146 393-399 (May, 1997)

There is clear evidence for lineage effects in the plant chloroplast genome. The tobacco plastid genome evolves slowly relative to that of the pea (WOLFE et al. 1987). In addition, the chloroplast genomes of both maize and rice have been shown to evolve more rapidly than their tobacco homologues (GAUT et al. 1993; MUSE and GAUT 1994), possibly because of a substitution rate acceleration in grass lineages (GAUT et al. 1992). However, there is some evidence to suggest that relative rates between grasses and other lineages are not conserved across genes, leading to the possibility that locus effects have supplemented lineage effects in the chloroplast genome (GAUT et al. 1993; MUSE and GAUT 1994). Whether or not extensive rate variation among evolutionary lineages, such as that at the plastid rbcL locus (BOUSQUET et al. 1992; GAUT et al. 1992), is a result of locus or lineage effects remains to be seen.

Empirical evidence supports the existence of both locus and lineage effects in the chloroplast genome. However, the interplay between locus and lineage effects has not been examined rigorously. In this work we develop a statistical framework for separating the two types of effects. The test that emerges poses the question "Are relative substitution rates among lineages well conserved across genes, or is it necessary to invoke locus-specific factors to explain the observed rates?" Tests are developed to address synonymous and nonsynonymous rates separately. We apply these procedures to 33 genes from rice, tobacco, liverwort, and pine, asking whether or not a single lineage effect is sufficient to explain the observed rate heterogeneity among lineages in the chloroplast genome. We conclude that variation in synonymous rates is, for the most part, consistent with lineage effects, perhaps generation time. Conversely, relative nonsynonymous rates are not well conserved among lineages, suggesting that these rates are driven largely by locus effects, perhaps natural selection.

MATERIALS AND METHODS

Sequence data: The data come from the complete chloroplast sequences of rice (HIRATSUKA et al. 1989), tobacco (SHINOZAKI et al. 1986), black pine (WAKASUGI et al. 1994), and the liverwort Marchantia polymorpha (OHYAMA et al. 1986). We used protein-coding genes of at least 200 bp. Sequence alignments are available by anonymous ftp at sid.biosci.missouri.edu in the directory pub/datasets/cpDNA.

[Figure 1: panels A and B, each pairing a locus A tree with a locus B tree.]

FIGURE 1.-Comparing relative branch lengths between loci. In both the top and bottom panels the trees for locus A are identical. In A the two loci have trees of different absolute lengths, but notice that the relative lengths of branches are the same for the two loci. In B the relative lengths of branches are different between the two loci. The scenario in A is consistent with a single force acting on both loci simultaneously.

Relative rate tests: The relative rate tests of MUSE and GAUT (1994) were used to compare nucleotide substitution rates between evolutionary lineages at individual loci. These are likelihood ratio tests, and the underlying model incorporates properties of the genetic code to allow separate comparisons of synonymous and nonsynonymous rates. The tests use an outgroup sequence, O, to test the null hypothesis that synonymous (or nonsynonymous) substitutions have accumulated at equal rates along the lineages leading to sequences A and B. If we denote synonymous rates as $\alpha$ and nonsynonymous rates as $\beta$, the following null hypotheses can be tested:

$$H_{\mathrm{syn}}: \alpha_A = \alpha_B,$$

$$H_{\mathrm{non}}: \beta_A = \beta_B.$$

When several loci are available, we can incorporate rate information from all of them to investigate potential changes in rate common to the entire group. Application of a sign test has proven to be useful for this purpose (WU and LI 1985; GAUT et al. 1993). For each locus, one of the two lineages has a higher rate estimate arising from the relative rate test procedure. Under the null hypothesis of equal rates between the two lineages for all loci (but possibly different rates among loci), each lineage has an equal chance of having the higher rate estimate. It follows that the number of times lineage A is estimated to be faster when a group of L loci are examined follows a Binomial(L, 0.5) distribution, and a sign test provides a method for testing the null hypothesis of no group-wide change in rates.
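The sign test just described can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the two-sided form is an assumption, since the text does not say whether a one- or two-sided test was used. The example count of 25 of 33 loci is our own tally of the synonymous rice-tobacco column of Table 1 and should be checked against the published table.

```python
from math import comb


def sign_test_p_value(faster_count: int, n_loci: int) -> float:
    """Exact two-sided sign test under Binomial(n_loci, 0.5).

    faster_count is the number of loci at which lineage A has the
    higher rate estimate (hypothetical argument names).
    """
    if not 0 <= faster_count <= n_loci:
        raise ValueError("faster_count must lie between 0 and n_loci")
    # Work with the larger of the two tallies so only one tail is summed.
    k = max(faster_count, n_loci - faster_count)
    upper_tail = sum(comb(n_loci, j) for j in range(k, n_loci + 1)) / 2 ** n_loci
    # Double the tail for a two-sided test, capping the result at 1.
    return min(1.0, 2.0 * upper_tail)


if __name__ == "__main__":
    # Illustrative tally (assumed): rice faster than tobacco at 25 of 33 loci.
    print(f"P = {sign_test_p_value(25, 33):.4f}")
```

With that tally the sketch gives a value that rounds to 0.005, in line with the sign test P value reported for the same comparison.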

Relative ratio tests: In this work we introduce a new approach for studying substitution rates when data are available from the same set of taxa for two or more genes (or regions of genes). The purpose of the test is to determine whether or not the relative lengths of branches are conserved across loci in a given set of taxa. The approach has been dubbed the "relative ratio test," and the idea is illustrated in Figure 1. In Figure 1A the two loci have different absolute rates (locus A is evolving faster than locus B), but the relative proportions of branches on the two trees are the same. The rate heterogeneity between evolutionary lineages in Figure 1A is consistent with a single factor (a lineage effect) acting on both loci simultaneously. For instance, the changes in substitution rates at both loci might coincide with changes in generation time or metabolic rate. Note that the presence of different levels of absolute constraint is not a complicating factor; two loci can maintain identical relative proportions while evolving at very different absolute rates. Contrast this with Figure 1B, where the rates between loci appear to be uncorrelated. The locations of fast and slow branches are not shared between the two genes, nor are the relative lengths of different branches. The heterogeneity illustrated in Figure 1B requires that separate forces (locus effects) be invoked at the two loci.
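The distinction between the two panels of Figure 1 can be made concrete with a small sketch. The branch lengths below are invented for illustration, and the helper names are hypothetical. The comparison assumes the branch lengths are known exactly, whereas the test itself works with estimated rates and a likelihood ratio statistic.

```python
def relative_proportions(branch_lengths):
    """Scale branch lengths so that they sum to 1 (the tree's 'shape')."""
    total = sum(branch_lengths)
    if total <= 0:
        raise ValueError("total tree length must be positive")
    return [b / total for b in branch_lengths]


def same_shape(tree_a, tree_b, tol=1e-9):
    """True when two trees on the same branches differ only by a scale factor."""
    if len(tree_a) != len(tree_b):
        return False
    pairs = zip(relative_proportions(tree_a), relative_proportions(tree_b))
    return all(abs(x - y) <= tol for x, y in pairs)


# Invented branch lengths for the five branches of a four-taxon tree.
locus_a = [0.30, 0.10, 0.20, 0.40, 0.05]
locus_b_panel_a = [0.5 * b for b in locus_a]         # shrunken copy of locus A
locus_b_panel_b = [0.05, 0.20, 0.10, 0.15, 0.025]    # different shape

if __name__ == "__main__":
    print(same_shape(locus_a, locus_b_panel_a))
    print(same_shape(locus_a, locus_b_panel_b))
```

The first comparison corresponds to the Figure 1A scenario (different absolute lengths, identical proportions) and the second to Figure 1B.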

[Figure 2: locus A tree and locus B tree, with branch parameters shown under the null and alternative hypotheses.]

FIGURE 2.-Parameterizing the relative ratio test. The parameters involved under both the null and alternative hypotheses are shown. In this figure the value of $\kappa$ illustrated is $< 1$, resulting in a “shorter” tree for locus B under $H_0$. The test places no constraints on $\kappa$, and it is not necessary to specify whether it is greater than or less than unity (see text for details).

To describe the test formally, first assume a simple evolutionary model where the substitution rate for locus A on branch i is described by a single parameter, $\mu_{A,i}$ (see Figure 2). Under the null hypothesis of proportionality described above, the corresponding parameters for locus B are $\mu_{B,i} = \kappa\mu_{A,i}$. Unequal absolute rates between loci are accommodated by the proportionality constant $\kappa$. When $\kappa$ is equal to 1, the two loci have identical rates. Allowing $\kappa$ to deviate from unity maintains the same branch length ratios between loci, but allows each locus to evolve at its own characteristic rate. Under the alternative hypothesis each branch on the tree for locus B has its own unique rate, $\mu_{B,i}$. The extension to more complex nucleotide substitution models is straightforward. We again use the model of MUSE and GAUT (1994), and develop separate tests for synonymous and nonsynonymous rates. In this case, each branch has separate synonymous ($\alpha$) and nonsynonymous ($\beta$) rate parameters, and the following null hypotheses can be tested using likelihood ratio statistics:

$$H^P_{\mathrm{syn}}: \alpha_{B,i} = \kappa_s \alpha_{A,i} \text{ for all branches } i,$$

$$H^P_{\mathrm{non}}: \beta_{B,i} = \kappa_n \beta_{A,i} \text{ for all branches } i.$$

The superscript P is intended to distinguish tests of proportionality from relative rate tests. For the model of $H^P_{\mathrm{syn}}$, applied to a tree with N branches, there are a total of 3N + 1 rate parameters: N nonsynonymous rates for each locus A and B, N synonymous rates for locus A, and the proportionality parameter $\kappa_s$ that determines the synonymous rates at locus B. It is important to note that the precise value of $\kappa_s$ is estimated from the data and need not be specified in advance. By symmetry, there are also 3N + 1 rate parameters for $H^P_{\mathrm{non}}$. For each of the null hypotheses, the appropriate model for the alternative hypothesis is one including no constraints on the parameters at either locus. Thus, $H_A$ consists of 4N rate parameters, N synonymous and N nonsynonymous rates for each locus. Using the above values, each of the likelihood ratio tests is seen to have $4N - (3N + 1) = N - 1$ d.f. Rejection of one of the null hypotheses implies that the true substitution rates have different relative proportions between the two loci, and that locus effects are necessary to explain the observed rates. One can view the relative ratio test as comparing the “shape” of the trees among different loci, asking whether or not the tree at one locus is a shrunken (or enlarged) version of the other.
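The logic of the test under the simple single-rate model can be sketched with a deliberately simplified stand-in. The sketch below does not implement the codon model of MUSE and GAUT (1994). It assumes, purely for illustration, that the number of substitutions on each branch is observed directly and is Poisson distributed. Under that assumption the null model has N + 1 parameters and the alternative has 2N, so the statistic again has N - 1 d.f. All function names and counts are hypothetical.

```python
from math import erfc, exp, gamma, log, sqrt


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, exp(-half)
    else:
        a, q = 0.5, erfc(sqrt(half))
    # ... and step up with Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += half ** a * exp(-half) / gamma(a + 1.0)
        a += 1.0
    return q


def toy_relative_ratio_test(counts_a, counts_b):
    """Likelihood ratio test of mu_B,i = kappa * mu_A,i for Poisson counts.

    counts_a and counts_b hold substitution counts on the same N branches
    at two loci (an illustrative simplification, not the published model).
    Returns (kappa_hat, statistic, df, p_value).
    """
    n = len(counts_a)
    if n != len(counts_b) or n < 2 or sum(counts_a) <= 0:
        raise ValueError("need two equal-length count lists with N >= 2")
    kappa = sum(counts_b) / sum(counts_a)      # null MLE of the scale factor
    stat = 0.0
    for xa, xb in zip(counts_a, counts_b):
        mu = (xa + xb) / (1.0 + kappa)         # null MLE of the locus A rate
        for observed, expected in ((xa, mu), (xb, kappa * mu)):
            if observed > 0:                   # a zero count contributes 0
                stat += 2.0 * observed * log(observed / expected)
    df = n - 1                                 # 2N - (N + 1)
    return kappa, stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    locus_a = [30, 10, 20, 40, 6]              # invented counts, five branches
    print(toy_relative_ratio_test(locus_a, [15, 5, 10, 20, 3]))
    print(toy_relative_ratio_test(locus_a, [5, 20, 10, 3, 15]))
```

In the first call locus B is an exact half-scale copy of locus A, so proportionality is not rejected. In the second the fast and slow branches are rearranged and the statistic is large. The estimate of the scale factor comes from the data in both cases, mirroring the point that its value need not be specified in advance.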


RESULTS

Both synonymous and nonsynonymous rates are accelerated in the grass lineage: We have previously documented a chloroplast-wide acceleration of both synonymous and nonsynonymous rates in the grass family (GAUT et al. 1993; MUSE and GAUT 1994). Since that time, the entire chloroplast sequence of black pine (Pinus thunbergii) has been completed (WAKASUGI et al. 1994). These new data let us extend our previous relative rate analyses of the chloroplast genome. In Table 1 we show the results of relative rate tests (MUSE and GAUT 1994) using 33 loci from across the chloroplast genome. There are very few rejections involving synonymous rates, but many nonsynonymous rate tests using rice lead to rejection of rate homogeneity. In Table 1 we also indicate for each locus which of the two lineages was estimated to have evolved faster. Sign tests allow us to test for chloroplast-wide accelerations in rate, and those P values are shown at the bottom of the table. Notice that rice is found to have evolved faster than both tobacco and pine at almost every locus. This holds true whether we consider synonymous (P values 0.005, 0.035) or nonsynonymous rates (P values 0.0003, 0.005). Base frequencies are surprisingly similar among the four taxa, so compositional bias cannot explain the observed differences (results not shown). There is no evidence of rate heterogeneity between tobacco and pine loci.

Comparing the pattern of substitution rates among loci: The relative rate tests of the previous section serve primarily to reinforce our earlier observations citing a chloroplast-wide acceleration in the grass lineage (GAUT et al. 1993; MUSE and GAUT 1994). However, as is the case for most observations of rate heterogeneity, little has been done to identify the source of the rate differences. The generation time effect has been implicated as a potential source of the grass acceleration at the plastid rbcL locus (GAUT et al. 1992). While the chloroplast data analyzed here will not lead to any definitive answers about specific mechanisms, they are sufficient to determine whether or not the observed rate heterogeneity between lineages can be explained by a single force, which may or may not be generation time, operating uniformly across all loci. In other words, we attempt to determine if the observed rate heterogeneity is consistent with the action of a lineage effect, or if it is necessary to invoke multiple locus effects to explain the data.

Using the model of MUSE and GAUT (1994), we have applied the relative ratio test to all pairs of loci. Results are shown in Figure 3. Test results above the diagonal are those for the test on nonsynonymous rates, $H^P_{\mathrm{non}}$, requiring only nonsynonymous rates to be proportional; synonymous rates are unconstrained. The null hypothesis of proportionality is rejected for almost half of the pairwise comparisons, with no discernible pattern. While some loci (e.g., rpoB) lead to rejection for most contrasts, several (e.g., @ l 4 ) reject for very few. Taken in total, the nonsynonymous results indicate that relative ratios of nonsynonymous rates are not maintained across chloroplast loci. This observation suggests that the level of selective constraint on amino acid sequences varies considerably between genes across lineages (see DISCUSSION).

TABLE 1

Relative rate test results

                           Synonymous              Nonsynonymous
Locus   Length  Region     R-T-P   R-P-M   T-P-M   R-T-P   R-P-M   T-P-M
atpA    1485    LSC        R       R       T       R***    R**     P
atpB    1473    LSC        R       P       P       R       R       T
atpE    396     LSC        R       P       T       R**     R       P
atpF    396     LSC        R       R       T       R       R       T
atpI    741     LSC        R       P       T       R       R*      T
petA    957     LSC        R       R       T       R       R*      T
petB    642     LSC        T       P       P       T       P       T
petD    519     LSC        R       P       T       R       R*      P
psaA    2250    LSC        R       R       T       R*      P       T
psaB    2199    LSC        R       R       T       R**     R       P
psbB    1500    LSC        T       R       T       T       P       P
psbC    1416    LSC        R       R       T       R       P       P
psbD    1056    LSC        R**     R       T       R       P       P
rbcL    1395    LSC        R       R       P       T       P       T
rpl2    411     LSC        R       P       P       R**     P       P
rpl14   369     LSC        T       P       T       R       R       P
rpl16   438     LSC        T       P       P       T       R       P
rpl20   342     LSC        R       R       T       R*      R       T**
rpl22   255     LSC        T       R       T       R***    R***    P
rpl23   270     IR         T       R       T       T       R*      P
rpoA    897     LSC        R       P       P       R**     R**     T*
rpoB    3162    LSC        R       R       T       R**     R       T
rpoC1   1701    LSC        R       R       T       R***    R***    T
rpoC2   1350    LSC        R       R       P       R**     R**     T*
rps11   435     LSC        R       R       T       R***    R**     P*
rps12   231     IR         T*      R       T       R*      R       P
rps14   306     LSC        R       R       P       R       P       T
rps19   279     IR/LSC     R       R       P       R       R**     T
rps2    702     LSC        R       R       T       R       R**     T
rps3    753     LSC        R       R       P       R       R       P
rps4    603     LSC        R       R       P       R**     R       T
rps7    465     IR         R       P       P       R***    R**     T
rps8    396     LSC        T       R       P       T       R*      T
P value                    0.005   0.035   0.296   0.0003  0.005   0.728

R, T, and P in the table entries indicate whether rice, tobacco, or pine was estimated to have evolved more rapidly. *, **, and *** indicate that the appropriate relative rate test corresponding to that cell was significant at the 0.05, 0.01, or 0.001 level, respectively. For example, when comparing nonsynonymous rates between rice and tobacco at the atpA locus using pine as an outgroup, we find that rice was estimated to have evolved more rapidly, and that the result is significant at the 0.001 level. The null hypothesis that the two lineages evolved at equal rates can be tested using a sign test, and the P values for this test are on the final row. The use of Marchantia polymorpha as an outgroup is indicated with an M.

The results below the diagonal are based on synonymous rates only. Whereas the nonsynonymous results were notable for their widespread rejection of proportionality, the results from synonymous rates are striking in their consistency with proportionality. The pattern of rejections is easily seen. Five loci (rpl2, rpl23, rps12, rps19, and rps7) reject proportionality in almost every case. These loci account for virtually all of the significant test results. In fact, only 25 of the 147 significant tests do not involve one of these five loci. These results suggest that a single lineage effect might be sufficient to explain most, but clearly not all, of the heterogeneity in synonymous rates. Of course, it is important to keep in mind that the absence of significant tests may simply be an artifact of low statistical power. A power analysis would help to distinguish between these two possibilities. Unfortunately, computational burden prevents this from being feasible. In addition, the breadth of the alternative hypothesis makes it difficult to design a particularly useful power study. Likelihood ratio tests typically have desirable power characteristics, and the fact that strong signals are found at five loci helps to reduce the concern that lack of power is a major factor in these analyses.
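The count of 25 of 147 significant tests can be put in context with a little arithmetic. The sketch below is a back-of-the-envelope illustration rather than part of the published analysis. It assumes that each unordered pair of the 33 loci was tested once for synonymous rates and that the 147 significant results are counted at the 0.05 level or below. The function and key names are hypothetical.

```python
from math import comb


def summarize_pairs(n_loci=33, n_outliers=5, significant=147, significant_other=25):
    """Split pairwise tests by whether they involve one of the outlier loci."""
    all_pairs = comb(n_loci, 2)
    pairs_other = comb(n_loci - n_outliers, 2)    # neither locus is an outlier
    pairs_outlier = all_pairs - pairs_other       # at least one outlier locus
    significant_outlier = significant - significant_other
    return {
        "all_pairs": all_pairs,
        "pairs_with_outlier": pairs_outlier,
        "pairs_without_outlier": pairs_other,
        "rejection_rate_with_outlier": significant_outlier / pairs_outlier,
        "rejection_rate_without_outlier": significant_other / pairs_other,
    }


if __name__ == "__main__":
    for name, value in summarize_pairs().items():
        print(name, round(value, 3))
```

Under these assumptions there are 528 pairwise comparisons, 150 of which involve at least one of the five loci. The rejection rate among the remaining 378 pairs is then about 0.066, close to the nominal 0.05 level.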

DISCUSSION

The work described here consists of two main components. The first component is the development of a framework for studying the evolutionary rate dynamics of two or more loci. Studies of molecular evolutionary rates have traditionally been one-dimensional: while several loci are often used in a study, the sequence data for each locus are usually analyzed individually. A “single locus” approach to studying evolutionary rates is often limited because many interesting phenomena, including generation time, polymerase fidelity, and metabolic rate, are expected to leave their signature effects across multiple loci (CLEGG et al. 1994). We believe that the relative ratio tests described here will be a useful tool for studying these multi-locus phenomena. The relative ratio test will also have other applications. For example, the test was used recently to identify regions of unique function in the spectrin gene family (MUSE et al. 1997). The possibility of studying the coevolution of genes (such as those in a host-parasite system) at the molecular level is also an exciting possibility.

The second major portion of this work is the use of two-locus procedures to examine the patterns of substitution rate heterogeneity in the chloroplast genome with a finer level of detail than had been achieved in previous studies. The results are quite intriguing. The acceleration of synonymous rates in the grass lineage documented here (Table 1) and in earlier work (GAUT et al. 1993; MUSE and GAUT 1994) is observed at most genes, and therefore is relatively consistent with the action of a single lineage effect. With the exception of five loci, synonymous relative ratio tests generally fail to reject the null hypothesis of proportional rates between loci. This lends credence to the possibility that a single lineage effect was responsible for most of the acceleration of synonymous rates in the grasses.

[Figure 3: matrix of pairwise relative ratio test results. Loci on both axes: atpA, atpB, atpE, atpF, atpI, petA, petB, petD, psaA, psaB, psbB, psbC, psbD, rbcL, rpl2, rpl14, rpl16, rpl20, rpl22, rpl23, rpoA, rpoB, rpoC1, rpoC2, rps11, rps12, rps14, rps19, rps2, rps3, rps4, rps7, rps8. Significance levels shown: 0.001, 0.01, 0.05, not significant.]

FIGURE 3.-Results of chloroplast loci relative ratio tests. The results above the diagonal are those from the nonsynonymous relative ratio test, $H^P_{\mathrm{non}}$. Below the diagonal are the results of the synonymous relative ratio test, $H^P_{\mathrm{syn}}$. The loci used for the pairwise comparisons are indicated on the axes.

The results for nonsynonymous rates stand in marked contrast to those for synonymous rates. While there is clearly a genome-wide increase in nonsynonymous substitution rate in the grass lineage (Table 1), rejection of the nonsynonymous relative ratio test is a ubiquitous property of pairwise comparisons of chloroplast loci, implying that a single lineage effect cannot explain the rate increase. These results suggest that synonymous and nonsynonymous substitution rates have very different dynamics. On a superficial level this is not surprising, since it has long been known that synonymous substitutions often accumulate more rapidly than do nonsynonymous substitutions. The relative ratio analysis extends our understanding of the differences between these two processes beyond the simple notion that the two types of changes occur with different frequencies. Rejection of a relative ratio test implies that the nonsynonymous substitution rates along lineages at one locus have different relative proportions than those at the second locus. On the basis of the relative rate and relative ratio test results one can infer both that nonsynonymous rates must be changing, perhaps frequently, in different evolutionary lineages, and that the location and magnitude of these changes are not shared among genes. In other words, nonsynonymous substitution rates must commonly be in a state of flux. This observation belies the presumption that nonsynonymous rates are merely reduced at some constant level relative to synonymous rates.

This conjecture is not without empirical support outside of this paper. Evolutionary rates were recently compared between grasses and palms at three loci representing each of the three plant genomes (GAUT et al. 1996; EYRE-WALKER and GAUT 1997). Synonymous rates were found to vary quite substantially between grasses and palms; at all three loci, grass sequences evolved much faster than their palm homologues. Furthermore, the proportional difference between grass and palm rates is quite similar across all three loci. As is the case with the chloroplast loci described here, nonsynonymous rates at the three loci do not follow such a simple pattern. At the chloroplast-encoded rbcL locus, nonsynonymous rates in grasses are faster than those in palms. The two families have similar rates at the nuclear Adh locus, while the palms are found to have higher nonsynonymous rates than grasses at the mitochondrial atpA locus. Clearly, the proportionality found among synonymous rates at these genes does not hold for nonsynonymous rates.

Any unified explanation of chloroplast molecular evolution must now account for the following features: (1) both synonymous and nonsynonymous substitution rates vary across evolutionary lineages, (2) with few exceptions (see below) the relative magnitude of synonymous rate changes is quite homogeneous across loci, and (3) the relative magnitudes of nonsynonymous rate changes vary extensively across loci. These observations are consistent with a scenario under which purifying selection blinks on and off in different genes in different lineages, reminiscent of the notion of an “episodic clock” (GILLESPIE 1986). This selective component is then coupled with lineage effects (perhaps generation time in the case of the grasses) operating on both synonymous and nonsynonymous rates. While details of the explanation can be disputed, the results presented here soundly reject the possibility of a simple, constant relationship between synonymous and nonsynonymous substitution rates.

The genes that reject proportionality of synonymous rates merit discussion. In Figure 3 there are five genes that differ markedly from the others: rpl2, rpl23, rps12, rps19, and rps7. It is striking to note that all of these genes are located on the inverted repeat (IR) region of the chloroplast. In fact, no other IR genes are included in the analysis; the remaining genes represent the long (LSC) and short (SSC) single-copy regions. If we were to choose five loci at random from the 33 sampled for this study, the probability of choosing the five IR genes is only 0.000004 (using the hypergeometric distribution). This result suggests that something curious is happening in the IR region of the chloroplast genome. Such a unique behavior for nonsynonymous rates is not apparent at these loci. However, it is difficult to identify “exceptional” loci with regard to nonsynonymous rates (Figure 3).
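The hypergeometric probability quoted above can be reproduced directly. The sketch below is only an arithmetic check and the function name is hypothetical. It evaluates the chance that a random draw of five loci from 33, without replacement, consists of exactly the five IR genes.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(exactly k successes) when drawing without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


if __name__ == "__main__":
    # 33 loci, 5 of them IR genes, 5 loci drawn, all 5 drawn loci are IR genes.
    print(f"{hypergeom_pmf(5, 33, 5, 5):.6f}")
```

The result is 1 / C(33, 5) = 1 / 237336, which rounds to the 0.000004 given in the text.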

The IR region of the genome is known to evolve more slowly than other regions of the chloroplast (WOLFE et al. 1987; CLEGG et al. 1991). It is possible that this is the result of a poorly understood mechanism for maintaining almost complete identity between the two copies of the IR (PALMER 1985). Yet another interesting feature of the IR region is that the size of the region varies over evolutionary time, and as a consequence genes can move in and out of the IR. It is expected that these fluctuations affect genic substitution rates. For example, a gene that moves from the IR into single copy regions of the genome is expected to undergo an acceleration in nucleotide substitution rate. As it turns out, the pine chloroplast genome has a much reduced IR region relative to rice, tobacco and Marchantia (WAKASUGI et al. 1994). The IR in pine contains only a single tRNA gene, with single copies of most other IR genes now present in the SSC and LSC regions. As a result of the virtual loss of the IR region in pine, all of the IR genes in our study have fluctuated over evolutionary time between the IR and the single copy regions of the genome. In addition, the rps19 gene, which straddles the IR/LSC junction in most plants, has moved completely inside the IR in grasses (GOULDING et al. 1996). Because of these fluctuations, we expect the IR genes in our study to vary in substitution rate over evolutionary time and have different rate dynamics relative to genes that have resided only in single copy regions. The results of relative ratio tests provide evidence to support these expectations.

It is interesting to note that relative rate tests (Table 1) do not provide any evidence that rate dynamics of IR genes are unique. This suggests that the relative ratio test may have more power to detect some types of rate heterogeneity. This may be a reasonable assumption based on the simple fact that the relative ratio test uses more data and thus has greater statistical power. However, it also reflects that the relative ratio test is designed to detect a different sort of rate heterogeneity than the relative rate test. While relative rate tests simply compare the absolute rates between two evolutionary lineages at a single locus, the relative ratio test compares patterns of evolutionary rates among all lineages at two or more loci. The combined use of the two tests provides a thorough description of the pattern of substitution rates at a collection of genetic loci.
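The distinction between the two kinds of comparison can be illustrated with toy numbers. The sketch below is not the likelihood ratio procedure used in this paper, and it ignores sampling error entirely: the branch lengths are invented, and the function names `lineage_difference` and `is_proportional` are hypothetical. It shows only which quantities each test looks at, namely two lineages at one locus versus all lineages at two loci.

```python
import math

def lineage_difference(locus, i, j):
    """Relative-rate-style quantity: difference between two lineages at ONE locus."""
    return locus[i] - locus[j]

def is_proportional(locus_a, locus_b, rel_tol=1e-9):
    """Relative-ratio-style null: branch lengths at locus B equal a single
    constant times those at locus A, for every lineage."""
    ratios = [b / a for a, b in zip(locus_a, locus_b)]
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)

# Invented branch lengths (substitutions per site) for three lineages.
locus_a = [0.10, 0.20, 0.30]
locus_b = [0.05, 0.10, 0.15]   # every branch is half as long as at locus A
locus_c = [0.05, 0.10, 0.30]   # the third lineage departs from that pattern

print(lineage_difference(locus_a, 1, 0))   # one locus, two lineages
print(is_proportional(locus_a, locus_b))   # two loci, all lineages
print(is_proportional(locus_a, locus_c))
```

In this toy case loci A and B share the same relative pattern of rates even though their absolute rates differ, whereas locus C does not. With real data the ratios are estimated with error, so proportionality has to be assessed statistically rather than by exact comparison.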

We thank ADAM EYRE-WALKER for helpful comments on this manuscript and for coining the name "relative ratio test." This work has been supported by a National Institute on Aging postdoctoral fellowship to S.V.M. and an Alfred P. Sloan Foundation fellowship to B.S.G.

LITERATURE CITED

BOUSQUET, J., S. H. STRAUSS, A. DOERKSEN and R. PRICE, 1992 Extensive variation in evolutionary rate of rbcL gene sequences among seed plants. Proc. Natl. Acad. Sci. USA 89: 7844-7848.

BRITTEN, R. J., 1986 Rates of DNA sequence evolution differ between taxonomic groups. Science 231: 1393-1398.

BULMER, M., K. H. WOLFE and P. M. SHARP, 1991 Synonymous nucleotide substitution rates in mammalian genes: implications for the molecular clock and the relationship of mammalian orders. Proc. Natl. Acad. Sci. USA 88: 5974-5978.

CLEGG, M. T., G. H. LEARN and E. M. GOLENBERG, 1991 Molecular evolution of chloroplast DNA, pp. 135-149 in Evolution at the Molecular Level, edited by R. K. SELANDER, A. G. CLARK and T. S. WHITTAM. Sinauer Associates, Sunderland, MA.

CLEGG, M. T., B. S. GAUT, G. H. LEARN and B. R. MORTON, 1994 Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci. USA 91: 6795-6801.

EASTEAL, S., C. COLLET and D. BETTY, 1995 The Mammalian Molecular Clock. R. G. Landes Company, Austin, TX.

EYRE-WALKER, A., and B. S. GAUT, 1997 Correlated rates of evolution across plant genomes. Mol. Biol. Evol. (in press).

GAUT, B. S., S. V. MUSE, W. D. CLARK and M. T. CLEGG, 1992 Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants. J. Mol. Evol. 35: 292-303.

GAUT, B. S., S. V. MUSE and M. T. CLEGG, 1993 Relative rates of nucleotide substitution in the chloroplast genome. Mol. Phylo. Evol. 2: 89-96.

GAUT, B. S., B. R. MORTON, B. MCCAIG and M. T. CLEGG, 1996 Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc. Natl. Acad. Sci. USA 93: 10274-10279.

GILLESPIE, J. H., 1986 Natural selection and the molecular clock. Mol. Biol. Evol. 3: 138-155.

GOULDING, S., R. OLMSTEAD, C. MORDEN and K. WOLFE, 1996 Ebb and flow of the chloroplast inverted repeat. Mol. Gen. Genet. 252: 195-206.

HIRATSUKA, J., H. SHIMADA, R. WHITTIER, T. ISHIBASHI, M. SAKAMOTO et al., 1989 The complete sequence of the rice (Oryza sativa) chloroplast genome: intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Mol. Gen. Genet. 217: 185-194.

KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, England.

LI, W.-H., D. L. ELLSWORTH, J. KRUSHKAL, B. H.-J. CHANG and D. HEWETT-EMMETT, 1996 Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol. Phylo. Evol. 5: 182-187.

MARGOLIASH, E., 1963 Primary structure and evolution of cytochrome c. Proc. Natl. Acad. Sci. USA 50: 672-679.

MARTIN, A. P., and S. R. PALUMBI, 1993 Body size, metabolic rate, generation time and the molecular clock. Proc. Natl. Acad. Sci. USA 90: 4087-4091.

MARTIN, A. P., G. J. P. NAYLOR and S. R. PALUMBI, 1992 Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357: 153-155.

MORAN, N. A., 1996 Accelerated evolution and Muller's rachet in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 93: 2873-2878.

MUSE, S. V., and B. S. GAUT, 1994 A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates with application to the chloroplast genome. Mol. Biol. Evol. 11: 715-724.

MUSE, S. V., A. G. CLARK and G. H. THOMAS, 1997 Comparisons of the nucleotide substitution process among repetitive segments of the α- and β-spectrin genes. J. Mol. Evol. (in press).

NICKRENT, D. L., and E. M. STARR, 1994 High rates of nucleotide substitution in nuclear small-subunit (18S) rDNA from holoparasitic flowering plants. J. Mol. Evol. 39: 62-70.

OHTA, T., 1992 The nearly neutral theory of molecular evolution. Ann. Rev. Syst. Ecol. 23: 263-286.

OHTA, T., 1993 An examination of the generation-time effect on molecular evolution. Proc. Natl. Acad. Sci. USA 90: 10676-10680.

OHYAMA, K., H. FUKUZAWA, T. KOHCHI, H. SHIRAI, T. SANO et al., 1986 Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322: 572-574.

PALMER, J. D., 1985 Evolution of chloroplast and mitochondrial DNA in plants and algae, pp. 131-240 in Molecular Evolutionary Genetics, edited by R. J. MACINTYRE. Plenum Press, New York.

SHINOZAKI, K., M. OHME, M. TANAKA, T. WAKASUGI, N. HAYASHIDA et al., 1986 The complete sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 5: 2043-2049.

WAKASUGI, T., J. TSUDZUKI, S. ITO, K. N. TSUDZUKI, T. TSUDZUKI et al., 1994 Loss of all ndh genes as determined by sequencing the entire chloroplast genome of the black pine Pinus thunbergii. Proc. Natl. Acad. Sci. USA 91: 9794-9798.

WOLFE, K. H., W.-H. LI and P. M. SHARP, 1987 Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. USA 84: 9054-9058.

WU, C.-I., and W.-H. LI, 1985 Evidence for higher rates of nucleotide substitution in rodents than in man. Proc. Natl. Acad. Sci. USA 82: 1741-1745.

ZUCKERKANDL, E., and L. PAULING, 1965 Evolutionary divergence and convergence in proteins, pp. 97-116 in Evolving Genes and Proteins, edited by V. BRYSON and H. J. VOGEL. Academic Press, New York.

Communicating editor: G. B. GOLDING