Abstract – The primary goal of the field of experimental phylogenetics is to generate branching histories of biological entities in the laboratory for use in testing methods of phylogenetic reconstruction. Here, I explore possible reasons why this field has remained small, despite hints of a bright future 15 years ago. Specifically, I examine three primary arguments that researchers have used to motivate the field of experimental evolution. The first involves claims that hypotheses in phylogenetics and molecular evolution are difficult to unambiguously falsify, and therefore an experimental approach is required. I argue that these claims do not specifically motivate experimental phylogenetics because they are based on an incorrect interpretation of the philosophy of historical science, and they do not differentiate between experimental evolution and its competitor, computer simulation. A related argument is that experimental phylogenetics can be used to understand the strengths and limitations of various methods of historical inference. This is a valid argument, but again does not distinguish between experimental evolution and computer simulation. In fact, I argue that high replication under different conditions is most important for testing methods, putting a premium on speed and leading to a disadvantage of experimental phylogenetics compared to computer simulation. A third argument does compare experimental phylogenetics to computer simulation, claiming that experimental evolution has increased realism compared to computer simulation. For example, experimental phylogenies may present modes of evolution not often implemented by computer simulations, such as common parallel or generally convergent evolution. These arguments do not decrease the value of completed experimental phylogenetic studies, but call for caution when weighing the costs of future studies that generate phylogenies in the lab.
Already as an undergraduate, I had an inordinate fondness for phylogenetic trees, and few papers sparked my imagination more than one announcing the birth of experimental phylogenetics (Hillis et al. 1992). In that paper, Hillis and colleagues generated experimentally a phylogeny of viruses and used it to compare various phylogenetic methods. For the first time, researchers had at their disposal a phylogeny of “living” organisms generated in the lab for the express purpose of studying phylogenetic methods. This known phylogeny came at a time when the enterprise of testing phylogenetic methods was in its heyday. Even popular culture was enamored with the ability to simulate life, as the Maxis software company released their enormously popular video game SimLife in the same year. In 1992, I expected experimental studies to be a wave of the future in phylogenetics.
Sometimes crystal balls can be foggy. Despite the enthusiasm of a decade and a half ago, the field of experimental phylogenetics remains very small (see also Forde and Jessup this volume). Was my enthusiasm misplaced? Here, I will discuss what I believe to be the reasons why the field has barely grown since its inception 15 years ago. Specifically, I will critique three primary arguments used to justify experimental phylogenetics. Most importantly, I conclude that experimental phylogenetics is an overly expensive simulation procedure. Even if experimental phylogenies have more biological realism than computer simulations, this realism comes at the considerable expense of decreased speed and potential for replication. This inherent trade-off between speed and biological realism is a recurring theme in experimental phylogenetics studies. Although an explicit understanding of the trade-off does not diminish the value of several previous studies, it may provide a guiding principle for those contemplating future contributions to experimental phylogenetics.
Motivation 1 – The perceived inferiority of historical science
One motivation in the literature for experimental phylogenetics has been a perceived inferiority of historical science, compared to experimental science. Here, I argue that there is no philosophical support for the claim that historical science is inferior to experimental science, thus negating one possible motivation for experimental evolution. Even though negating one motivation does not alone negate the entire rationale for experimental evolution, it is nevertheless important to promote a clearer understanding of historical science.
To some authors, experimental phylogenetics is motivated by the self-consciousness of historical scientists in the face of experimental science. We learn from an early age that “real” science relies on the possibility of unambiguously falsifying hypotheses. Yet specific events that happened in the past – like the phylogenetic branching of mammals – can never be recreated. Like the legal system of the United States, historical science relies on demonstrating “beyond a reasonable doubt” that particular events did or did not occur. In science, reconstructing past events often takes the form of statistical/probability statements. Additionally, verifying specific historical occurrences may rely on various signatures left by historical events, such as the presence of a crater, high levels of iridium, and absence of previously prevalent fossils all dating to 65 million years ago, which congruently support the historical hypothesis of mass extinction by extraterrestrial impact. Although philosophers of science argue for the efficacy of such historical inference (Cleland 2001), there is still widespread perception of its inferiority.
This inferiority complex that burdens historical scientists is evident in the writing of Bull et al. (1993), illustrating it as a motivator for the field of experimental phylogenetics:
From a cold and cruel perspective of the scientific method, the major weakness of this field is its difficulty in unambiguously falsifying hypotheses of phylogenetic relationships, and hence, of molecular evolution.
Here, the authors are stating that “the scientific method” – which I take to mean Popperian falsificationism – is the preferred way to perform science. A difficulty in falsifying historical hypotheses is seen by the authors as a major liability for phylogenetics and molecular evolution studies. If only we could actually test historical hypotheses through experimentation – the logic goes – this liability would be lessened. This attitude seems pervasive. For example, Nature editor Henry Gee (Gee 1999) wrote that historical hypotheses “can never be tested by experiment, and so they are unscientific… No science can ever be historical.” Yet another author, Skell (2005), wrote that “much of the evidence that might have established the theory [referring to “Darwin’s theory of evolution”] on an unshakable empirical foundation, however, remains lost in the distant past.” That article makes many errors, especially in conflating historical and experimental scientific studies and in passing unvalidated value judgments on them. Skell’s article also naively equates all of evolution with a few “Just So Stories” about natural selection, and ignores many practical applications of evolutionary theory, including gene function prediction and measures of biodiversity, to name just two of many. Unfortunately, that article was written by a member of the National Academy of Sciences, thereby suggesting scientific credibility on the issue, and has been highlighted by the anti-evolution religious organization, the Discovery Institute.
Despite common perception, this inferiority complex for historical science is unwarranted for at least two reasons (Cleland 2001). First – despite what we learn in introductory science classes – there are problems with strict falsificationism. For example, probability statements are not falsifiable, yet they are still scientific because they are testable, indicating that a better theory of testability than falsificationism is required (Sober 2007). Furthermore, strict falsificationism is rarely followed, even by practicing experimental scientists. The reason is that, in any experiment, numerous variables are not controlled by the investigator. Even the seemingly simplest experiments do not control many potential variables, such as solar flares, humidity, season, etc., because it is usually safe to assume that many variables do not affect the experiment at hand. As a result, the possibility always remains that an unsupported hypothesis is not supported because of one of these ancillary assumptions, even if the original hypothesis is true. Therefore, experimental scientists often examine these ancillary assumptions to determine whether they are responsible for the failure of the hypothesis at hand. For example, I remember many hypotheses about physical laws that were not supported by my experiments in Introductory Physics Lab. Rather than falsifying established laws of physics, I invoked the failure of ancillary assumptions, such as “this ancient and abused student balance produces reliable data.”
A second reason to reject claims of inferiority for historical science, regardless of the status of falsificationism, is that historical hypotheses that explain observable phenomena provide predictions to be tested, and are therefore scientific. In practice, these predictions often act as confirmatory hypotheses; historical scientists seek to demonstrate a “smoking gun” – strong evidence for a specific event (Cleland 2001). As an example, Darwin’s historical hypothesis that all living organisms derive from a common ancestor has left numerous traces consistent with that hypothesis, including the use of RNA and DNA by all organisms, shared use of the same subset of all possible amino acids, and a nearly universal genetic code (for more detailed discussion of the hypothesis and difficulties in testing it see Sober and Steel 2002). This “smoking gun” perspective is not necessarily falsificationist, yet it is clearly scientific by presenting testable hypotheses.
Another way that historical scientists work is to test ancillary assumptions of historical models. For example, Darwin hypothesized that natural selection gradually built complex eyes from simple precursors. This model assumes that functional intermediates exist at all stages between simple and complex eyes. Darwin (1859), and later Salvini-Plawen and Mayr (1977), provided support for this model by describing the functioning eyes of living animals at numerous stages of complexity. In addition, Nilsson and Pelger (1994) found strong support for another ancillary assumption of the natural selection hypothesis – that there has been sufficient time for gradual selection to build eyes of observed complexity. It is true that we cannot recreate the evolution of the human eye. Nevertheless, we can make models of how eye evolution proceeded and test the ancillary assumptions of that model. Clearly, historical inference is scientific and – while philosophically different than experimental science – should not be construed as inferior. Therefore, a perceived inferiority should not be used as a motivation for experimental phylogenetics.
Thus far, I have only negated one argument (the perceived inferiority of historical science) for experimental phylogenetics, and as such have not yet provided any arguments against it, or for any alternative approach. The next two sections make explicit comparisons between experimental phylogenetics and the alternative approach of computer simulation. Before considering whether experimental phylogenetics allows for increased biological realism over computer simulation, I will consider the value of experimental phylogenetics for testing methods of phylogenetic inference.
Motivation 2 – Testing phylogenetic methods
Although claims for the inferiority of historical science do not have a sound philosophical basis, another motivation for experimental phylogenetics appears philosophically sound. Specifically, understanding the relative strengths and weaknesses of methods of inference is an important scientific endeavor, and experimental phylogenies can be used to attain these goals. However, simply realizing that experimental phylogenetics can be of use is not sufficient, because other approaches can be used to the same end. Therefore, a convincing argument for conducting experimental phylogenetics must provide justification over and above other possible approaches.
Computer simulation, statistical analysis, and congruence all can be used to assess the performance of phylogenetic methods (Hillis 1995). While a full review of methods and philosophies for testing phylogenetic methods is beyond the scope of this chapter, and they have been reviewed elsewhere (e.g. Grant 2002; Hillis 1995), I conclude here that generating biological phylogenies is an overly expensive enterprise, costing a prohibitively large amount of investigator time compared to computer simulation. Speed can be increased in specific situations, but perhaps at the expense of biological realism. The question then becomes whether increased biological realism overcomes the increased cost over computer simulation. I will argue that it does not, concurring with others who have pointed out that experimental phylogenetics is subject to the same constraints as simulations: in either situation, it is necessary to assume the evolutionary processes present in the tests apply universally (Grant 2002; Sober 1993). This assumption is especially true when trying to establish the efficacy of methods, as opposed to the shortcomings. Any one replicate history can call into question the reliability of a method, but because any single replicate could be non-general, establishing reliability of methods requires generating replicates under many different assumptions or parameter values.
The need for speed: Costs and creative solutions
The goal of experimental phylogenetics is to generate clades of organisms (or genes or historical documents) with a known history and to examine the performance of methods for reconstructing that known history. Perhaps the most compelling advantage (discussed in detail below as motivation 3) of experimental phylogenetics over computer simulations comes down to the possibility for increased biological realism. As Hillis et al. (1993) wrote: “The point of the experimental approach is to avoid approximating biological evolution by examining actual cases of biological evolution.”
To be practical, experimental phylogenetics requires the ability to generate clades on a timescale of months or less, which in turn requires using systems with brief generation times and rapid rates of evolution. Obtaining such rapid rates of evolution restricts the set of organisms that can be utilized. This is the first cost of the need for speed: a reliance on the assumption that rapidly replicating biological systems faithfully model other systems, including those that evolve on long time scales. Even some of the most rapidly evolving systems have been further modified to increase their rate of evolution, leading to additional departures from natural biological systems. For example, the mutagen N-methyl-N’-nitrosoguanidine (NG) was added to increase the mutation rate of viruses in experimental phylogenetics (Hillis et al. 1992). The mutagen increases mutation rate, but also changes the mutational profile, causing G->A or C->T changes to be most common (Bull et al. 1993). Here again, the altered mutational profile may be considered a deviation from biological realism that is a necessary byproduct of increasing the speed of evolution.
As necessity is often the mother of invention, the demonstrated need for speed in experimental phylogenetics inspired some creative solutions. For example, Cunningham et al. (1997) and Cunningham et al. (1998) produced a modular experimental phylogeny, which could be analyzed in multiple ways. Starting from a wild-type T7 bacteriophage, they evolved six separate lineages, each of which was bifurcated once. As a result, they were able to assemble multiple different four-taxon phylogenies with varying relative branch lengths, from a single original experiment (Cunningham et al. 1998). This highlights one major difference between testing methods of phylogenetic tree inference and methods of ancestral state reconstruction. Any phylogeny has multiple nodes, such that ancestral state reconstruction methods can be examined on each of them. For ancestral states, there is an automatic replication. For testing phylogenetic trees, and for testing correlations between characters (correlative comparative methods: review in Garland et al. 2005), it may always be wise for the experiment to be modular, to allow for increased replication from the expensive experiment.
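The replication gained from a modular design can be illustrated with a small counting sketch. It assumes, purely for illustration, that a four-taxon tree is assembled by pairing two of the six bifurcated lineages; the lineage labels A to F and the helper name four_taxon_trees are hypothetical, and the original studies may have combined lineages differently.

```python
from itertools import combinations

# Hypothetical labels: six lineages (A-F), each bifurcated once into two tips.
lineages = {name: (f"{name}1", f"{name}2") for name in "ABCDEF"}


def four_taxon_trees(lineages):
    """Pair bifurcated lineages into four-taxon trees (Newick-like strings)."""
    trees = []
    for a, b in combinations(sorted(lineages), 2):
        (a1, a2), (b1, b2) = lineages[a], lineages[b]
        trees.append(f"(({a1},{a2}),({b1},{b2}));")
    return trees


trees = four_taxon_trees(lineages)
print(len(trees))  # 15 distinct pairings from one experiment
print(trees[0])    # ((A1,A2),(B1,B2));
```

Under this assumption, a single experiment with six lineages yields 15 four-taxon test cases, which is the sense in which modularity buys replication.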
Another ingenious compromise between the need for speed in simulation studies and “biological realism” is hypermutagenic polymerase chain reaction (PCR). Instead of using living organisms or viruses, researchers have generated experimental phylogenies by utilizing the mutagenic properties inherent in copying DNA. By winnowing the evolving biological system to DNA and polymerase, the researchers have greatly increased the speed at which replicates can be generated. For example, Vartanian et al. (2001) copied a dihydrofolate reductase gene of Escherichia coli into a phylogeny of 124 “pseudogenes.” Sanson et al. (2002) used similar methodology to generate sequence data (over 2200 bp each) for an experimental phylogeny with 15 ancestor and 16 terminal sequences. However, just as in viral phylogenies, the increased speed in PCR-generated phylogenies comes at the expense of biological realism. In the PCR experiments, the biological system is reduced to an enzyme and DNA. The complexities of mutation and selection in the face of changing environments are greatly simplified in a PCR system compared to nature.
A third creative solution to the trade-off between speed and biological reality was parametric bootstrapping. Parametric bootstrapping involves estimating parameters of a model from real data, and using those parameter estimates and model to simulate multiple datasets (Efron 1985; Felsenstein 1988). Bull et al. (1993) estimated parameters for restriction site evolution from a bacteriophage experimental phylogeny. Using these parameters, they simulated by computer the evolution of multiple datasets to test methods of phylogeny reconstruction and molecular evolutionary inferences. Some may argue that this parametric bootstrap procedure provides a balance between biological realism and speed. Parameters are estimated from a biological system and speed is gained by simulating multiple replicates by computer. However, the parameters of molecular evolution do not have to be estimated using experimental phylogenetic data; any comparative data set could be used to infer model parameters. Furthermore, if experiments on model selection are any guide, then model parameters might be well estimated even if the true phylogeny is not known precisely. That is, in simulation experiments, the specific starting tree had little effect on the models of molecular evolution chosen as statistically best-fit (Posada and Buckley 2004; Posada and Crandall 2001), suggesting that the same might hold for parameter estimates of those models. In summary, parametric bootstrapping is a valuable tool that can extend the results gained from experimental phylogenetics (Bull et al. 1993). However, I remain unconvinced that experimental phylogenetic data are more valuable for parameter estimation than are comparative data from any naturally evolving system.
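The logic of a parametric bootstrap can be sketched in a few lines. The example below is not the restriction-site model of Bull et al. (1993); as a stand-in it assumes the simpler Jukes-Cantor model for a single pair of sequences, and the counts (30 differences in 500 sites) are invented for illustration. A parameter is estimated from the "observed" data, and that estimate is then used to simulate many replicate datasets.

```python
import math
import random


def jc_distance(p):
    """Jukes-Cantor distance from the proportion p of differing sites (p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_p_from_distance(d):
    """Expected proportion of differing sites at Jukes-Cantor distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def parametric_bootstrap(n_diff, n_sites, n_reps=1000, seed=1):
    """Estimate d from the data, then simulate replicate datasets under that estimate."""
    rng = random.Random(seed)
    d_hat = jc_distance(n_diff / n_sites)   # step 1: estimate the parameter
    p_model = jc_p_from_distance(d_hat)     # per-site difference probability
    replicates = []
    for _ in range(n_reps):                 # step 2: simulate under the fitted model
        diffs = sum(rng.random() < p_model for _ in range(n_sites))
        replicates.append(jc_distance(diffs / n_sites))
    return d_hat, replicates


# Invented example data: 30 differing sites out of 500.
d_hat, reps = parametric_bootstrap(30, 500, n_reps=200, seed=1)
reps.sort()
print(round(d_hat, 4), round(reps[5], 4), round(reps[194], 4))
```

The sorted replicate estimates give an approximate sampling distribution for the distance. Nothing in the procedure requires that the input data come from an experimental phylogeny, which is the point made above.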
Motivation 3 – Increased biological realism
Perhaps the most plausible justification for the use of experimental phylogenetics relates to arguments that it provides increased biological realism. Unlike the previous arguments I discussed, this one is based on an explicit comparison between experimental phylogenetics and computer simulation. If experimental phylogenetics really does add increased biological realism over computer simulation, then this would be a powerful argument for the approach.
What is biological realism?
Experimental evolutionists take biological realism to mean elements that contribute to an evolving system that are not decided a priori by the investigator (see also Huey and Rosenzweig this volume). I will refer to this as the degree of specification. In a computer simulation, usually the only factor that is not specified by the investigator is one or more sequences of random numbers. Of course, these random numbers can be used to specify many elements of a simulation, such as timing of branching events, or rates of evolution. In experimental evolution, many elements are also specified, for example the branching pattern of the phylogeny (Hillis et al. 1992). However, some aspects of experimental evolution are not specified by the investigator, such as the mutational process and the relationship between mutations and a phenotype like virus replication rate (Oakley and Cunningham 2000). The claim of proponents of the field is that these non-specified elements increase biological realism over computer simulation.
My own biological reality
The above claims for increased realism may be difficult to assess with generality because they involve comparing a real-world system to a mathematical statistical model. We must decide, then, how well the models used in computer simulation account for real-world evolution. The models used in simulation, and the real-world trajectory of evolutionary history are so varied, it is difficult to know where to begin when attempting such a comparison. Nevertheless, this perspective suggests that the value of experimental phylogenies might be increased over computer simulations if experimental approaches are more likely to present the researcher with situations that are not explicitly modeled, but are produced by the non-specified aspects of the evolutionary process itself.
Such a situation occurred in my only foray into experimental phylogenetics. I was using the bacteriophage phylogeny generated by Hillis et al. (1992) to study methods of ancestral state reconstruction for phenotypic traits (Oakley and Cunningham 2000). I found that virulence evolved in a way I didn’t expect a priori – there were large amounts of homoplasy. Systematists often assume that characters should usually evolve phylogenetically, such that close relatives share traits that are more similar than distant relatives. This is the inherent assumption behind methods like independent contrasts (Felsenstein 1985; Garland et al. 2005), and it is an assumption that is often tested now (e.g. Abouheif 1999; Blomberg et al. 2003). However, simulated data are often neutral. In real-world systems, homoplasy may be very common, driven by structural and functional demands on organisms (reviewed in Conway Morris 2003).
In the case of the bacteriophage phylogeny, instead of close relatives being more similar in virulence characteristics than distant relatives, the character was highly convergent. I observed parallel decreases in virulence in all the experimental lineages, which were rapid enough to erase all phylogenetic signal of the character. For example, a non-phylogenetic model of character evolution (Lee and Yin 1996; Mooers and Schluter 1998; Mooers et al. 1999; Oakley et al. 2005) was the best fit among nine Brownian-motion based models. Had I used neutral computer simulations exclusively in testing ancestral state reconstruction methods, I might not have modeled the evolutionary trajectory actually taken by the viruses. Here, the viruses might have provided more biological realism than computer simulation in that the biological system is arguably less specified than a computer simulation.
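A toy calculation shows why a parallel trend is troublesome for ancestral state reconstruction under a neutral model. It is a sketch with invented numbers, not a reanalysis of the bacteriophage data. It assumes two sister lineages with equal branch lengths, for which the Brownian-motion estimate of their common ancestor is simply the mean of the two tip values; the function names and trait values are hypothetical.

```python
def evolve_sisters(ancestor, shared_trend, dev_a, dev_b):
    """Tip values for two sisters: ancestor + shared trend + lineage-specific deviation."""
    return ancestor + shared_trend + dev_a, ancestor + shared_trend + dev_b


def ancestor_estimate(tip_a, tip_b):
    """Brownian-motion estimate for two sisters with equal branch lengths: the tip mean."""
    return (tip_a + tip_b) / 2.0


def contrast(tip_a, tip_b):
    """Difference between sister tips; any change shared by both lineages cancels."""
    return tip_a - tip_b


TRUE_ANCESTOR = 10.0  # invented trait value
neutral = evolve_sisters(TRUE_ANCESTOR, 0.0, 0.5, -0.5)
trended = evolve_sisters(TRUE_ANCESTOR, -4.0, 0.5, -0.5)  # parallel decrease

for label, tips in (("no trend", neutral), ("parallel decrease", trended)):
    error = ancestor_estimate(*tips) - TRUE_ANCESTOR
    print(label, "| ancestor error:", error, "| contrast:", contrast(*tips))
```

With no trend the ancestor is recovered exactly in this toy case, whereas the parallel decrease shifts the estimate by the full size of the trend. The difference between the sisters is the same in both cases, because the shared change cancels.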
One counter argument to this discussion of the enhanced biological realism of experimental studies is that a wholly empirical system arrived at very similar conclusions to my study of ancestral virulence in bacteriophage: Webster and Purvis (2002) investigated extinct and living foraminifera and found that strong directional change in body size erased phylogenetic signal for this character. If an empirical system showed the same results, then perhaps an experimental system was not needed to find the results. Yet, appropriate fully empirical systems may be rare, and may have higher costs than even experimental evolution in investigator time spent understanding the system.
Summary
Despite enthusiasm in the early 1990s for a future of experimental phylogenetics, the field has stalled and produced very few papers and few novel insights. Part of the explanation is that phylogenetic methodologies have become rather standardized tools for evolutionary inference. However, as I argued above, two other considerations point to fundamental flaws in the foundations of the field. First, historical science is not inferior to experimental science. Historical and experimental sciences are philosophically different, and historical science is not inferior or less scientific. Therefore, the perceived inferiority of historical science cannot be used to justify any experimental approach in science, including experimental phylogenetics. Second, I argued that experimental phylogenies are probably not inherently more valuable than any other "simulation," and they are vastly more expensive in terms of investigator time and resources. As such, experimental phylogenetic studies that have already been conducted are no less valuable than any simulation study, but researchers contemplating new experimental phylogenetics should carefully weigh the costs. One possible saving grace for experimental phylogenetics is the possibility that computer simulations are highly specified, such that experimental approaches might be more likely to produce unanticipated but biologically realistic results (see also Swallow et al. this volume on one important value of replication in selection experiments -- the possibility of finding "multiple solutions"). This is a difficult proposition to argue for or against quantitatively, but certainly highlights the requirement that simulations must be based on as much biological knowledge as possible, which might limit generality and/or increase the cost of performing them. I hasten to point out that the critique presented here does not apply to experimental evolution in general, which can still serve as a valid demonstration of evolutionary processes. However, my own foray into experimental phylogenetics left me unsatisfied, and this paper presents the reasons why.
References
Abouheif, E. 1999. A method for testing the assumption of phylogenetic independence in comparative data. Evolutionary Ecology Research 1:895-909.
Blomberg, S. P., T. Garland, Jr., and A. R. Ives. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57:717-745.
Bull, J. J., C. W. Cunningham, I. J. Molineux, M. R. Badgett, and D. M. Hillis. 1993. Experimental molecular evolution of bacteriophage T7. Evolution 47:993-1007.
Cleland, C. 2001. Historical science, experimental science, and the scientific method. Geology 29:987-990.
Conway Morris, S. 2003, Life's Solution: Inevitable humans in a lonely universe. Cambridge University Press, Cambridge.
Cunningham, C. W., K. Jeng, J. Husti, M. Badgett, I. J. Molineux, D. M. Hillis, and J. J. Bull. 1997. Parallel molecular evolution of deletions and nonsense mutations in bacteriophage T7. Molecular Biology and Evolution 14:113-116.
Cunningham, C. W., H. Zhu, and D. M. Hillis. 1998. Best-fit maximum likelihood models for phylogenetic inference: Empirical tests with known phylogenies. Evolution 52:978-987.
Darwin, C. 1859, On the origin of species by means of natural selection, or, The preservation of favoured races in the struggle for life. London, John Murray ...
Efron, B. 1985. Bootstrap confidence intervals for a class of parametric problems. Biometrika 72:45-58.
Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125:1-15.
—. 1988. Phylogenies from molecular sequences: inferences and reliability. Annual Review of Genetics 22:521-565.
Garland, T., Jr., A. F. Bennett, and E. L. Rezende. 2005. Phylogenetic approaches in comparative physiology. Journal of Experimental Biology 208:3015-3035.
Gee, H. 1999, In search of deep time: Beyond the fossil record to a new history of life. New York, The Free Press.
Grant, T. 2002. Testing methods: The evaluation of discovery operations in evolutionary biology. Cladistics 18:94-111.
Hillis, D. M. 1995. Approaches for assessing phylogenetic accuracy. Systematic Biology 44:3-16.
Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett, and I. J. Molineux. 1993. Experimental approaches to phylogenetic analysis. Systematic Biology 42:90-92.
Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett, and I. J. Molineux. 1992. Experimental phylogenetics: generation of a known phylogeny. Science 255:589-592.
Lee, Y., and J. Yin. 1996. Detection of evolving viruses. Nature Biotechnology 14:491-493.
Mooers, A. Ø., and D. Schluter. 1998. Fitting macroevolutionary models to phylogenies: an example using vertebrate body sizes. Contributions to Zoology 68:3-18.
Mooers, A. Ø., S. M. Vamosi, and D. Schluter. 1999. Using phylogenies to test macroevolutionary hypotheses of trait evolution in Cranes (Gruinae). American Naturalist 154:249-259.
Nilsson, D. E., and S. Pelger. 1994. A pessimistic estimate of the time required for an eye to evolve. Proceedings of the Royal Society of London B 256:53-58.
Oakley, T. H., and C. W. Cunningham. 2000. Independent contrasts succeed where ancestor reconstruction fails in a known bacteriophage phylogeny. Evolution 54:397-405.
Oakley, T. H., Z. Gu, E. Abouheif, N. H. Patel, and W. H. Li. 2005. Comparative methods for the analysis of gene-expression evolution: an example using yeast functional genomic data. Molecular Biology and Evolution 22:40-50.
Posada, D., and T. Buckley. 2004. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology 53:793-808.
Posada, D., and K. A. Crandall. 2001. Selecting the best-fit model of nucleotide substitution. Systematic Biology 50:580-601.
Salvini-Plawen, L. V., and E. Mayr. 1977, On the evolution of photoreceptors and eyes: Evolutionary Biology, v. 10. New York, Plenum Press.
Sanson, G. F., S. Y. Kawashita, A. Brunstein, and M. R. Briones. 2002. Experimental phylogeny of neutrally evolving DNA sequences generated by a bifurcate series of nested polymerase chain reactions. Molecular Biology and Evolution 19:170-178.
Skell, P. S. 2005. Evolutionary theory contributes little to experimental biology. The Scientist 19:10.
Sober, E. 1993. Experimental tests of phylogenetic inference methods. Systematic Biology 42:85-89.
—. 2007. What is wrong with intelligent design? The Quarterly Review of Biology 82:3-8.
Sober, E., and M. Steel. 2002. Testing the hypothesis of common ancestry. Journal of Theoretical Biology 218:395-408.
Vartanian, J. P., M. Henry, and S. Wain-Hobson. 2001. Simulating pseudogene evolution in vitro: determining the true number of mutations in a lineage. Proceedings of the National Academy of Sciences of the USA 98:13172-13176.
Webster, A. J., and A. Purvis. 2002. Testing the accuracy of methods for reconstructing ancestral states of continuous characters. Proceedings of the Royal Society of London B 269:143-149.
Blomberg, S. P., T. Garland, Jr., and A. R. Ives. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution Int J Org Evolution 57:717-745.
Bull, J. J., C. W. Cunningham, I. J. Molineux, M. R. Badgett, and D. M. Hillis. 1993. Experimental molecular evolution of bacteriophage T7. Systematic Biology 47:993-1007.
Cleland, C. 2001. Historical science, experimental science, and the scientific method. Geology 29:987-990.
Conway Morris, S. 2003, Life's Solution: Inevitable humans in a lonely universe. Cambridge University Press, Cambridge.
Cunningham, C. W., K. Jeng, J. Husti, M. Badgett, I. J. Molineux, D. M. Hillis, and J. J. Bull. 1997. Parallel molecular evolution of deletions and nonsense mutations in bacteriophage T7. Molecular Biology and Evolution 14:113-116.
Cunningham, C. W., H. Zhu, and D. M. Hillis. 1998. Best-fit maximum likelihood models for phylogenetic inference: Empirical tests with known phylogenies. Evolution 52:978-987.
Darwin, C. 1859, On the origin of species by means of natural selection, or, The preservation of favoured races in the struggle for life. London, John Murray ...
Efron, B. 1985. Bootstrap confidence intervals for a class of parametric problems. Biometrika 72:45-58.
Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125:1-15.
—. 1988. Phylogenies from molecular sequences: inferences and reliability. Annual Review of Genetics 22:521-565.
Garland, T., Jr., A. F. Bennett, and E. L. Rezende. 2005. Phylogenetic approaches in comparative physiology. Journal of Experimental Biology 208:3015-3035.
Gee, H. 1999, In search of deep time: Beyond the fossil record to a new history of life. New York, The Free Press.
Grant, T. 2002. Testing methods: The evaluation of discovery operations in evolutionary biology. Cladistics 18:94-111.
Hillis, D. M. 1995. Approaches for assessing phylogenetic accuracy. Systematic Biology 44:3-16.
Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett, and I. J. Molineux. 1993. Experimental approaches to phylogenetic analysis. Systematic Biology 42:90-92.
Hillis, D. M., J. J. Bull, M. E. White, M. R. Badgett, and I. J. Molineux. 1992. Experimental phylogenetics: generation of a known phylogeny. Science 255:589-592.
Lee, Y., and J. Yin. 1996. Detection of evolving viruses. Nature Biotechnology 14:491-493.
Mooers, A. Ø., and D. Schluter. 1998. Fitting macroevolutionary models to phylogenies: an example using vertebrate body sizes. Contributions to Zoology 68:3-18.
Mooers, A. Ø., S. M. Vamosi, and D. Schluter. 1999. Using phylogenies to test macroevolutionary hypotheses of trait evolution in Cranes (Gruinae). American Naturalist 154:249-259.
Nilsson, D. E., and S. Pelger. 1994. A pessimistic estimate of the time required for an eye to evolve. Proceedings of the Royal Society of London B 256:53-58.
Oakley, T. H., and C. W. Cunningham. 2000. Independent contrasts succeed where ancestor reconstruction fails in a known bacteriophage phylogeny. Evolution 54:397-405.
Oakley, T. H., Z. Gu, E. Abouheif, N. H. Patel, and W. H. Li. 2005. Comparative methods for the analysis of gene-expression evolution: an example using yeast functional genomic data. Molecular Biology and Evolution 22:40-50.
Posada, D., and T. Buckley. 2004. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology 53:793-808.
Posada, D., and K. A. Crandall. 2001. Selecting the best-fit model of nucleotide substitution. Systematic Biology 50:580-601.
Salvini-Plawen, L. V., and E. Mayr. 1977, On the evolution of photoreceptors and eyes: Evolutionary Biology, v. 10. New York, Plenum Press.
Sanson, G. F., S. Y. Kawashita, A. Brunstein, and M. R. Briones. 2002. Experimental phylogeny of neutrally evolving DNA sequences generated by a bifurcate series of nested polymerase chain reactions. Mol Biol Evol 19:170-178.
Skell, P. S. 2005. Evolutionary theory contributes little to experimental biology. The Scientist 19:10.
Sober, E. 1993. Experimental Tests of Phylogenetic Inference Methods 42:85-89.
—. 2007. What is wrong with intelligent design? The Quarterly Review of Biology 82:3-8.
Sober, E., and M. Steel. 2002. Testing the hypothesis of common ancestry. J Theor Biol 218:395-408.
Vartanian, J. P., M. Henry, and S. Wain-Hobson. 2001. Simulating pseudogene evolution in vitro: determining the true number of mutations in a lineage. Proc Natl Acad Sci U S A 98:13172-13176.
Webster, A. J., and A. Purvis. 2002. Testing the accuracy of methods for reconstructing ancestral states of continuous characters. Proc R Soc Lond B Biol Sci 269:143-149.