Our lab develops methods that broaden our understanding of the evolutionary processes responsible for generating the patterns we observe in the tree of life. Specifically, we develop novel computational and statistical methods for estimating evolutionary parameters in a phylogenetic context. We evaluate the properties and performance of phylogenetic methods using biological data sets and through the development of new simulation tools. Additionally, with collaborators, we apply phylogenetic techniques to empirical data sets to understand patterns of biodiversity, morphological evolution, and historical biogeography.
Historical observations in the form of fossil occurrence times or other geological data are fundamental components necessary for inferring the absolute timing of speciation events. The common approach to time calibration taken by most neontologists is to assign minimum age estimates, based on fossil specimens, to nodes within their group of interest. However, we can rarely assign fossils to nodes of a tree without error, and much of the information associated with fossil taxa is lost when we try to represent the fossil record as a single time estimate applied to a single internal node.
In collaboration with Tanja Stadler (ETH Zürich) and John Huelsenbeck (UC–Berkeley), I developed a unified diversification model for extant and extinct species that eliminates the need for the ad hoc prior densities that are common practice in Bayesian phylogenetics. The ‘fossilized birth-death’ (FBD) process is a model for calibrating divergence-time estimates in a Bayesian framework and explicitly acknowledges that extant species and fossils are representatives of the same macroevolutionary process. Under this model, we can estimate internal node ages conditional on a tree of extant taxa and a set of fossil occurrence times. We developed novel reversible-jump MCMC methods to marginalize over realizations of this process and implemented them in a recent version of DPPDiv. This model improves upon standard approaches for calibrating phylogenies with fossil data by allowing for inclusion of all available fossils and providing coherent measures of statistical uncertainty. (See Heath et al., PNAS 2014.)
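To make the idea that fossils and extant species arise from the same process concrete, the sketch below simulates fossil occurrence times along a single lineage. It assumes fossil recovery is a Poisson process with a constant rate, here called `psi`, along each lineage. The function name, argument names, and numerical values are hypothetical and chosen only for illustration; this is not the DPPDiv implementation, and it does not perform any inference.

```python
import random


def sample_fossil_times(start_age, end_age, psi, rng):
    """Simulate fossil occurrence times on one lineage.

    The lineage exists from start_age (older) to end_age (younger),
    both measured as time before the present. Fossil recovery is
    modeled as a Poisson process with rate psi (an assumption of this
    sketch), so waiting times between fossils are exponential.
    """
    times = []
    age = start_age
    while True:
        # Move toward the present by an exponential waiting time.
        age -= rng.expovariate(psi)
        if age <= end_age:
            return times
        times.append(age)


if __name__ == "__main__":
    rng = random.Random(2014)
    # Hypothetical lineage spanning 10 to 2 time units before present.
    fossils = sample_fossil_times(10.0, 2.0, psi=0.5, rng=rng)
    print(len(fossils), "fossils at ages", [round(t, 2) for t in fossils])
```

Under these assumptions, the expected number of fossils on a lineage is `psi` multiplied by the lineage's duration, so long-lived lineages tend to leave more fossils than short-lived ones.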
In collaboration with Dr. John Nason (Iowa State University), Dr. E. Allen Herre (Smithsonian Tropical Research Institute, Panama), Dr. Charlotte Jandér (Harvard University), Dr. Carlos Machado (University of Maryland), and Dr. Robert Raguso (Cornell University), we are investigating the coevolutionary history of species interactions in Central American figs and their pollinating (mutualistic) and non-pollinating (antagonistic) fig wasps. Figs and their fig wasp pollinators and parasites have coevolved for ~90 million years to become both highly diverse (>750 species of figs) and ecologically important “keystone” components of tropical forest ecosystems. Figs and wasps have long been assumed to represent a case of strict co-speciation, with highly specific pollinator and parasitic (non-pollinator) wasps identifying appropriate hosts via distinctive volatile chemical signals. More recent studies, however, suggest a more complex scenario involving an evolutionary history punctuated by host shifts by individual wasp species. Although the wasp associations with fig hosts have been widely studied, the genetic consequences for the host figs of host-shifting pollinators and the mechanisms underlying host recognition remain poorly understood.
This project will fill these gaps by producing robust, detailed, many-gene phylogenies for 14 strangling fig (Ficus) species and their associated pollinating (Pegoscapus) and non-pollinating (Idarnes) fig wasps (~60 species) from the vicinity of Barro Colorado Island, Panama. Using transcriptome sequences, we will target ~300 genes from each of three species per lineage for capture and subsequent Illumina sequencing. Phylogenies will be inferred using Bayesian methods and will enable robust testing of phylogenetic congruence between figs and fig wasps. Further, they will guide population-level genotyping-by-sequencing to test a priori predictions of potential cases of hybridization in the figs and of host shifting and race formation in both pollinator and non-pollinator wasps. Combined with quantification of wasp-attracting fig volatiles and fruit-surface chemicals, this work will detect and resolve the genomic consequences of host introgression due to host-shifting pollinator wasps, and link them to the chemical basis of host recognition.
This research will significantly clarify both the patterns and processes underlying the evolutionary ecology of fig and fig wasp interactions. Our standardized, genomic approach is essential for: 1) obtaining robust fig and fig wasp species trees, 2) delimiting fig species and discriminating cases of introgressive gene flow from shared ancestral polymorphism, and 3) linking introgression of figs and their chemical phenotypes to cases of pollinator host shifting. Our community-level approach is also essential to obtain the across-species replication necessary for robust statistical inference of diversification pattern and process across interacting fig and wasp taxa.
Rates of Molecular Evolution — Accurate estimates of species divergence times are vital to answering questions about historical biogeography, estimating diversification rates, and identifying the causes of variation in rates of molecular evolution. However, obtaining reliable divergence time estimates is confounded by the fact that the rate of evolution and time are intrinsically linked. I am currently working on new models for estimating branch lengths in units proportional to time without assuming a strict molecular clock. Our understanding of how substitution rates change over time is still underdeveloped. My research in this field is aimed at identifying models that accommodate rate change resulting from different processes. Such flexible models can provide more accurate estimates of species divergence times. I developed a model for estimating lineage-specific substitution rates that employs a Dirichlet process prior (DPP) for modeling mixtures of data. Under this model, branches of a phylogenetic tree are clustered into specific rate classes and the number of different rate classes and their associated rates are treated as random variables. Thus, a strict molecular clock and a model where each lineage evolves independently at a different rate are special cases of this model. Analyses of simulated data show that divergence times and branch rates are more accurate under the DPP when compared with alternative models for substitution rate variation. Furthermore, Bayesian inference under this model can identify how phylogenetic lineages are partitioned according to substitution rate, information unavailable under alternative priors. (See Heath et al., MBE 2012)
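The clustering behavior of a Dirichlet process prior can be illustrated with a short simulation. The sketch below draws an assignment of branches to rate classes using the Chinese restaurant process representation of the DPP, then draws one rate per class. The concentration parameter `alpha`, the gamma base distribution and its `shape` and `scale` values, and all function names are assumptions made for this illustration; the code only samples from a prior and is not the published inference method.

```python
import random


def assign_rate_classes(n_branches, alpha, rng):
    """Assign branches to rate classes via the Chinese restaurant process.

    Branch i (0-based) joins existing class k with probability
    sizes[k] / (i + alpha) and starts a new class with probability
    alpha / (i + alpha).
    """
    classes = []  # classes[i] is the class index of branch i
    sizes = []    # sizes[k] is the number of branches in class k
    for i in range(n_branches):
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(sizes)  # default: open a new class
        for k, size in enumerate(sizes):
            cumulative += size
            if u < cumulative:
                chosen = k
                break
        if chosen == len(sizes):
            sizes.append(0)
        sizes[chosen] += 1
        classes.append(chosen)
    return classes


def draw_branch_rates(n_branches, alpha, rng, shape=2.0, scale=0.5):
    """Draw class assignments and per-branch substitution rates.

    Each class receives one rate from a gamma base distribution
    (a hypothetical choice for this sketch); every branch in a class
    shares that rate.
    """
    classes = assign_rate_classes(n_branches, alpha, rng)
    n_classes = max(classes) + 1
    class_rates = [rng.gammavariate(shape, scale) for _ in range(n_classes)]
    return classes, [class_rates[c] for c in classes]


if __name__ == "__main__":
    rng = random.Random(2012)
    classes, rates = draw_branch_rates(n_branches=10, alpha=1.0, rng=rng)
    print("rate classes:", classes)
    print("branch rates:", [round(r, 3) for r in rates])
```

The two special cases mentioned above correspond to the extremes of `alpha` in this sketch: as `alpha` approaches zero, all branches fall into a single class (a strict clock), and as `alpha` grows very large, nearly every branch receives its own rate.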