This vignette provides information about how the package stores and outputs information about species taxonomy via the
taxonomy object and available options for simulating species taxonomy.
A phylogenetic tree object, in the format used by most software packages, does not typically contain information about species over time. A timescaled phylogeny is informative about the minimum number of co-existing lineages at a given moment but not about the start and end point of species or the mode of speciation. Three modes of speciation that have been discussed widely in palaeobiological literature are shown in Fig. 1. and can be modelled using
The distinction between these processes is important because they have consequences on parameter estimates using different methods.
In molecular phylogenetics, we typically draw trees as fully bifurcating structures (i.e. each node gives rise to two descendant branches). This is also how trees are stored in computer memory by most phylogenetic software packages, including
ape. However, it may not be the case that each internal edge represents a unique species. Instead, some speciation events may be budding rather than bifurcating or anagenesis may have occurred at some point along internal edges. To explore the impact of different speciation modes it is valuable to know the relationship between species through time (the taxonomy) and the corresponding tree object.
FossilSim contains options for simulating species evolution under different speciation modes, described in the section Simulating taxonomy. Information about the true species taxonomy is stored in the
taxonomy object, which will be associated with the corresponding
phylo object used to simulate species evolution.
The object contains a dataframe detailing species taxonomy, which contains the following 7 fields for each edge and species in the corresponding
sp the true species identity label. If all species originated via budding or bifurcation this will always correspond to the terminal-most edge label (i.e. the youngest node) associated with each species. This is not the case if the data set also contains anagenetic species, when multiple species may be associated with a single edge
edge edge label of the branch in the corresponding phylo object. Note that some species may be associated with multiple edges
parent ancestor of species
sp. Parent labels follow the same convention as species. The label assigned to the parent of the origin or root will be zero
start start time of the corresponding edge and/or origin time of the species. If the corresponding edge is also the oldest edge associated with the species this value will equal the species origination time. If speciation mode is asymmetric or symmetric the speciation time will match the start time of the corresponding edge. If speciation mode is anagenetic the speciation time will be younger than the start time of the corresponding edge
end end time of the corresponding edge and/or end time of the species. If the corresponding edge is also the youngest edge associated with the species this value will equal the species end time. Unless the species end time coincides with an anagenetic speciation event, the species end time will match the end time of the corresponding edge. If the species end time coincides with an anagenetic speciation event, the end time will be older than the end time of the corresponding edge
mode mode = speciation mode. “o” = origin or “r” = root (the edge/species that began the process). “b” = asymmetric or budding speciation. “s” = symmetric or bifurcating speciation. “a” = anagenetic speciation
taxonomy object can be created by passing the above information to the
Cryptic species can also be simulated. Under cryptic speciation the descendant species will be indistinguishable from the ancestor and the
taxonomy object can incorporate this information using the additional optional fields:
cryptic TRUE if the speciation event was cryptic. If this information is not passed to
taxonomy(), the function assumes
cryptic = FALSE for all species
cryptic.id cryptic species identity, i.e. ancestral species that the current species will be identified as. If
cryptic = TRUE,
cryptic.id will differ from the true species identity
The relationship between information provided by the
taxonomy object and the corresponding
phylo object will become clear when you start simulating species and exploring the output.
taxonomy object can be used as the starting point for simulating fossils or other downstream analyses that require information about discrete species units.
Special Note The birth and death rates used as parameters in the birth-death process typically do not correspond to the rates of appearance and extinction of species under the above processes, unless speciation occurs entirely via budding.
Species evolution or taxonomy can be simulated in
FossilSim under a mixed model of speciation that can incorporate three modes of speciation – budding (or asymmetric), bifurcating (or symmetric) and anagenetic – in addition to cryptic speciation (Stadler et al. 2018). This model of mixed speciation was described previously in (Bapst 2013) and is also available as part of the
paleotree package (Bapst 2012).
In a birth-death process \(\lambda\) is the rate at which branching events occur, while \(\mu\) is the rate at which lineage termination occurs. These are the parameters typically used to simulate birth-death trees (see the code snippet below for an example). The mixed model of speciation includes three additional parameters, \(\beta, \lambda_a\) and \(\kappa\), which are defined as follows:
The relationship between these processes and the underlying tree structure in the corresponding
phylo object is described in section The taxonomy object. Simulating species evolution or taxonomy for any user specified tree in
FossilSim is straightforward.
# set the random number generator seed to generate the same results using the same code set.seed(123) # simulate a tree using TreeSim conditioned on tip number lambda = 1 mu = 0.2 tips = 8 t = TreeSim::sim.bd.taxa(n = tips, numbsim = 1, lambda = lambda, mu = mu)[] # t is an object of class phylo t #> #> Phylogenetic tree with 10 tips and 9 internal nodes. #> #> Tip labels: #> t2, t5, t8, t4, t9, t10, ... #> #> Rooted; includes branch lengths. # use t$edge, t$edge.length, t$root.edge to see the tree attributes # simulate under complete budding speciation s = sim.taxonomy(tree = t) # this is equivalent to using the default parameters beta = 0, lambda_a = 0, kappa = 0 # s is an object of class taxonomy s #> sp edge parent start end mode cryptic cryptic.id #> 1 1 1 8 0.38710924 0.00000000 b 0 1 #> 2 2 2 4 0.14461369 0.00000000 b 0 2 #> 3 3 3 6 0.08786013 0.00000000 b 0 3 #> 4 4 14 10 0.84708717 0.52482777 b 0 4 #> 5 4 17 10 0.52482777 0.14461369 b 0 4 #> 6 4 4 10 0.14461369 0.00000000 b 0 4 #> 7 5 5 8 0.10233497 0.00000000 b 0 5 #> 8 6 19 8 0.44503437 0.08786013 b 0 6 #> 9 6 6 8 0.08786013 0.00000000 b 0 6 #> 10 7 7 4 0.52482777 0.00000000 b 0 7 #> 11 8 15 10 1.53730579 0.44503437 b 0 8 #> 12 8 16 10 0.44503437 0.38710924 b 0 8 #> 13 8 18 10 0.38710924 0.10233497 b 0 8 #> 14 8 8 10 0.10233497 0.00000000 b 0 8 #> 15 9 9 10 1.17167856 0.09161934 b 0 9 #> 16 10 11 0 3.34383891 1.53730579 o 0 10 #> 17 10 12 0 1.53730579 1.17167856 o 0 10 #> 18 10 13 0 1.17167856 0.84708717 o 0 10 #> 19 10 10 0 0.84708717 0.34120366 o 0 10 #> Taxonomy representing 10 species across 19 edges.
Note that the way
sim.taxonomy assigns budding species is deterministic – at each branching event, the “left” descendant edge will always be assigned the ancestral species label and the “right” descendant edge will always be the new species. This means even without setting the random number seed
sim.taxonomy will always produce the same output when \(\beta = 0\).
FossilSim contains various options for plotting the package output. The
plot.taxonomy function will produce a plot highlighting species taxonomy along with the corresponding bifurcating tree object.
Note that R detects what type of object you are trying to plot, in this case taxonomy, and will automatically call
plot.taxonomy when you apply the function
plot to an object of class
?plot.taxonomy to view further options that can be used to change the appearance of the figure.
plot.taxonomy only plots the true species, and ignores any information about cryptic speciation (i.e. the function uses the
sp labels and ignores
Given an existing
taxonomy object you can also add anagenetic or cryptic species downstream using the functions
sim.cryptic.species. These functions also return a
# simulate taxonomy without anagenetic or cryptic species s1 = sim.taxonomy(tree = t, beta = 0.5) # simulate anagenetic species # note this function also requires the corresponding tree object s2 = sim.anagenetic.species(tree = t, species = s1, lambda.a = 1) # simulate cryptic species s3 = sim.cryptic.species(species = s2, kappa = 0.1)
These functions can not be used with
taxonomy objects that already contain anagenetic or cryptic species.
paleotree vignette to see how
FossilSim objects can be converted into
Bapst, D.W. (2012). Paleotree: An R package for paleontological and phylogenetic analyses of evolution. Methods in Ecology and Evolution, 3, 803–807.
Bapst, D.W. (2013). When can clades be potentially resolved with morphology? Plos One, 8, e62312.
Stadler, T., Gavryushkina, A., Warnock, R.C.M., Drummond, A.J. & Heath, T.A. (2018). The fossilized birth-death model for the analysis of stratigraphic range data under different speciation modes. Journal of Theoretical Biology, 447, 41–55.