Analysis of a recent mutation accumulation (MA) experiment has led to the suggestion that as many as one-half of spontaneous mutations in Arabidopsis are advantageous for fitness. We evaluate this in the light of data from other MA experiments, along with molecular evidence, that suggest the vast majority of new mutations are deleterious.
A recent analysis of a mutation-accumulation (MA) experiment in Arabidopsis thaliana led to the conclusion that roughly half of new spontaneous mutations increase reproductive fitness (Shaw et al. 2002). This inference was based on the divergence for two fitness-related traits, number of fruits per plant and number of seeds per fruit, among inbred sublines maintained by selfing with minimal selection for 17 generations. The method used to infer the shape of the distribution of mutational effects, Markov chain Monte Carlo maximum likelihood (MCMCML), represents a welcome advance over previous methods based on maximum likelihood (ML) (Keightley 1994; Keightley and Ohnishi 1998), principally because data recorded over several generations can be analyzed simultaneously. The only previous attempt at multigeneration ML analysis assumed a model with equal mutational effects (Keightley and Bataillon 2000). To accommodate both positive and negative mutational effects, Shaw et al. (2002) assumed a one-sided gamma distribution with a displacement parameter, which represents a significant departure from the reflected gamma distribution employed in previous studies (Keightley 1994; García-Dorado 1997; Keightley and Ohnishi 1998).
Although uncertainties remain with respect to the form of the mutational-effect distribution (García-Dorado et al. 1999; Keightley and Eyre-Walker 1999; Lynch et al. 1999; Bataillon 2000), a great deal of evidence from several sources strongly suggests that the overall effects of mutations are to reduce fitness. Indirect evidence comes from asymmetrical responses to artificial selection on life history traits, suggesting that variance for these traits is maintained by downwardly skewed distributions of mutational effects (Frankham 1990). More direct evidence comes from spontaneous MA experiments in Drosophila (Fry et al. 1999; Chavarrias et al. 2001), Caenorhabditis elegans (Keightley and Caballero 1997; Vassilieva and Lynch 1999; Vassilieva et al. 2000), wheat (T. Bataillon pers. comm.), yeast (Wloch et al. 2001; Zeyl and DeVisser 2001), Escherichia coli (Kibota and Lynch 1996), and a different MA experiment in Arabidopsis (Schultz et al. 1999). All of these experiments detected downward trends in MA line population mean fitness relative to control populations as generations accrued. As far as we know, there is no case of even a single MA line maintained by bottlenecking that showed significantly higher fitness than its contemporary control populations. Ethyl methanesulfonate (EMS) mutagenesis experiments, in which controls are given identical treatment to mutagenized lines, other than a dose of mutagen, have also shown consistently strongly negative effects on fitness traits in Drosophila (Mukai 1970; Ohnishi 1977; Mitchell and Simmons 1977; Temin 1978; Keightley and Ohnishi 1998; Yang et al. 2001) and C. elegans (Keightley et al. 2000). Similarly, transposable element insertional mutagenesis leads to reduced fitness in Drosophila (Mackay et al. 1992) and E. coli (Elena and Lenski 1997).
Although the above studies have focused on the fitness effects of mutations in the context of laboratory environments, substantial indirect evidence derived from molecular studies supports the contention that most mutations in natural populations are deleterious (e.g., Ohta 1995; Li 1997; Fay et al. 2001). The fraction of amino-acid altering mutations that is deleterious enough to be removed by selection is approximated by C = 1 − Kn/Ks, where Kn and Ks are the substitution rates at nonsynonymous and synonymous sites, respectively. If mutations are neutral on average, C, the proportion of “missing” amino-acid substitutions, would have an expected value of 0.0. However, in all taxa examined so far, average values of C are in excess of 0.7 (e.g., Ohta 1995; Eyre-Walker et al. 2002), implying that the majority of amino-acid altering mutations are deleterious. There is nothing obviously unusual with respect to A. thaliana in this regard. Wright et al. (2002) and S. Wright (pers. comm.) have recently investigated constraint in the protein-coding genes of two species of Arabidopsis, A. lyrata (an outcrosser) and A. thaliana (a natural inbreeder), using an outgroup to infer lineage-specific constraint. Estimates for C are 0.88 in both species, despite their different systems of mating; C is likely to underestimate the fraction of amino-acid mutations that are deleterious due to fixation of advantageous amino-acid mutations and purifying selection acting at synonymous sites (Eyre-Walker et al. 2002).
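As a minimal illustration of the constraint calculation above, the following Python sketch computes C = 1 − Kn/Ks. The function name `constraint` and the input rates are hypothetical, chosen only so that the result matches the reported value of 0.88; they are not estimates from Wright et al. (2002) or any other study.

```python
def constraint(kn: float, ks: float) -> float:
    """Return C = 1 - Kn/Ks, the approximate fraction of amino-acid
    altering mutations removed by selection."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return 1.0 - kn / ks


# Hypothetical substitution rates (illustrative only, not published data).
kn_example = 0.012  # nonsynonymous substitutions per site
ks_example = 0.10   # synonymous substitutions per site

c_example = constraint(kn_example, ks_example)
print(round(c_example, 2))

# If amino-acid mutations were neutral on average, Kn would equal Ks
# and C would be zero.
c_neutral = constraint(0.10, 0.10)
```

A value of C near 1 indicates that almost all amino-acid altering mutations fail to become fixed, whereas C near 0 indicates little or no constraint.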
How can the above observations be reconciled with Shaw et al.'s (2002) conclusion that 50% of mutations are advantageous for fitness in A. thaliana? First, it is possible that the traits examined by Shaw et al. have intermediate optima and are under stabilizing selection, and so are not genuine major fitness components. Such an effect has been observed in C. elegans, in which effects of EMS mutagenesis on productivity late in life are approximately symmetrically distributed, whereas effects on early productivity, a trait more closely related to population growth rate, are strongly skewed downwards (Keightley et al. 2000). Similar effects could also explain the apparently bidirectional effects of EMS mutagenesis on fruit number in A. thaliana (Camara and Pigliucci 1999).
Second, the length of the Shaw et al. experiment (17 generations) may have been insufficient to reveal a significant change in mean phenotype, given the level of replication in the MA and control populations. Consider the MA experiment in C. elegans carried out over 60 generations by Keightley and Caballero (1997). For lifetime reproductive output, the difference in control and MA-line means was nonsignificant at generation 32, although there was significant among-MA-line variance. By generation 60, the among-MA-line variance was somewhat greater, but the change in overall mean was still nonsignificant. However, the downward skew of the MA-line means in generation 60 provided a strong signature of the effect of MA: seven MA lines had a mean productivity of more than two control standard deviations (SD) below the control mean, whereas the best performing MA line had mean productivity of only 1.5 SD above the control mean. The 214-generation experiment of Vassilieva et al. (2000) clarifies the slowly emerging pattern: as the period of MA extended, progressively lower fitness classes accumulated, whereas the frequencies of the highest fitness classes observed in the controls progressively diminished. Contrary to the suggestions of Shaw et al. (2002), there is no evidence for improved fitness in any periodically bottlenecked C. elegans line.
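The dependence of statistical power on the length of an MA experiment can be sketched under a simple compound-Poisson model. This model is an assumption made here for illustration and is not the analysis used in any of the studies cited above. If each line accumulates mutations at rate U per generation with independent effects a, the expected shift in the mean after t generations is tU E[a] and the among-line variance is tU E[a^2], so the ratio of the shift to its standard error grows as the square root of t. The function name and all parameter values below are hypothetical.

```python
import math


def shift_to_se_ratio(generations, n_lines, u, mean_effect, mean_sq_effect):
    """Expected mean shift of MA lines divided by the standard error
    of the MA mean.

    Assumes Poisson-distributed mutation numbers with independent effects,
    and ignores environmental variance and sampling error in the controls.
    """
    expected_shift = generations * u * mean_effect      # t * U * E[a]
    among_line_var = generations * u * mean_sq_effect   # t * U * E[a^2]
    standard_error = math.sqrt(among_line_var / n_lines)
    return expected_shift / standard_error


# Hypothetical parameters (not estimates from any cited experiment).
params = dict(n_lines=50, u=0.1, mean_effect=-0.01, mean_sq_effect=0.0004)

z17 = shift_to_se_ratio(17, **params)
z60 = shift_to_se_ratio(60, **params)

# Under this model the ratio of the two values depends only on the
# number of generations: it equals sqrt(60 / 17).
print(z60 / z17)
```

Because line means are themselves estimated with error, and control means add further sampling variance, ratios in a real experiment would be smaller than this idealized calculation implies.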
A difficulty with the analysis in Shaw et al. (2002) is the absence of an evaluation of alternative models for the distribution of mutational effects. Indeed, from results reported in their Table 2, it appears that a constant-effects model cannot be rejected for at least one of the two traits (fruit number). Moreover, it was not possible to compare the fit of models with different proportions of positive and negative mutations, because the parameter specifying the location of the mutational-effect distribution had to be fixed prior to the application of MCMCML to avoid “numerical instability.” The latter behavior suggests that the model is overparameterized, and that there is insufficient information in the data from this MA experiment to estimate the proportion of advantageous mutations, which seems to be corroborated by the fact that the standard errors of Vm estimates in this study are about half of the estimates themselves (Shaw et al. 2000).
Although Shaw et al. (2002, p. 460) imply that previous analyses have failed to allow for bidirectional mutations, methods to estimate the proportion of beneficial mutations in MA experiments have been developed and applied (Keightley 1994; Keightley and Caballero 1997; García-Dorado 1997; Keightley and Ohnishi 1998). For example, by fitting a reflected gamma distribution model to the data from a C. elegans MA experiment, the ML estimate for the proportion of positive mutations was found to be zero, with an upper limit of 28% (Keightley and Caballero 1997). These earlier approaches assumed a reflected gamma distribution of mutational effects, in preference to the “displaced gamma” preferred by Shaw et al. (2002). Whereas the reflected gamma distribution has a potential limitation in that it can generate a singularity for mutations with effects near zero, a shifted gamma distribution has the undesirable effect of truncating the distribution of mutations below a lower limit. No model can be expected to incorporate all of the subtleties of biological phenomena, but well-established statistical methodology exists for evaluating the justification for alternative models.
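The contrast between the two distributional forms can be made concrete with a short simulation. The sketch below uses only the Python standard library; the function names, the parameterization (a gamma magnitude with a random sign for the reflected form, and a gamma variate minus a fixed displacement for the displaced form), and all parameter values are assumptions for illustration and are not taken from Shaw et al. (2002) or from the earlier ML analyses.

```python
import random


def reflected_gamma(rng, shape, scale, p_positive):
    """Draw a mutational effect whose magnitude is gamma distributed and
    whose sign is positive with probability p_positive."""
    magnitude = rng.gammavariate(shape, scale)
    return magnitude if rng.random() < p_positive else -magnitude


def displaced_gamma(rng, shape, scale, displacement):
    """Draw a gamma variate and shift it down by a fixed displacement,
    so that no effect can fall below -displacement."""
    return rng.gammavariate(shape, scale) - displacement


rng = random.Random(1)  # fixed seed so the sketch is repeatable
n_draws = 10_000

# Hypothetical parameter values. A shape below 1 gives the reflected
# form its density spike for effects near zero.
reflected = [reflected_gamma(rng, 0.5, 1.0, 0.1) for _ in range(n_draws)]
displaced = [displaced_gamma(rng, 0.5, 1.0, 0.2) for _ in range(n_draws)]

frac_positive_reflected = sum(x > 0 for x in reflected) / n_draws
frac_positive_displaced = sum(x > 0 for x in displaced) / n_draws

print(frac_positive_reflected, frac_positive_displaced)
print(min(reflected), min(displaced))
```

In the reflected form the proportion of positive effects is an explicit parameter and negative effects are unbounded, whereas in the displaced form the proportion of positive effects is determined jointly by the shape, scale, and displacement, and effects below the displacement are excluded by construction.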
In summary, the vast majority of mutations are deleterious. This is one of the best-established principles of evolutionary genetics, supported by both molecular and quantitative-genetic data, and it provides an explanation for many key genetic properties of natural and laboratory populations.