--- title: "Phylogenetic mixed model" author: "Geir H. Bolstad" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Phylogenetic mixed model} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- # The `Almer` function The `Almer` function is a modification/hack of the `lmer` function in the `lme4` package to incorporate correlated effects in the random structure. The `Almer` function can be used to fit phylogenetic mixed models (and other models with correlated random effects such as "animal models"). ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(evolvability) ``` To start, we need an ultrametric phylogeny of unit depth. We can construct this, for example, using the function `rtree` of the `ape` package. ```{r} # Only a very small sample size is used # in the interest of computational speed: set.seed(57) n_species <- 50 tree <- ape::rtree(n = n_species) tree <- ape::chronopl(tree, lambda = 1) ``` From this we can generate the phylogenetic relatedness matrix A. ```{r} A <- Matrix::Matrix(ape::vcv(tree), sparse = TRUE) ``` The column names of A must be the species identifier. ```{r} colnames(A) <- rownames(A) <- paste("species", 1:n_species, sep = "_") ``` From this we can simulate a Brownian motion process and add some residual noise. ```{r} y <- 5 + t(chol(A))%*%rnorm(n_species, 0, 2) + # BM process with mean = 5 and sd = 2 rnorm(n_species, 0, 1) # residual variation with sd = 1 ``` For `Almer` to work, the data must include the species identifier in addition to the species means. ```{r} dt <- data.frame(species = colnames(A), y = as.vector(y)) ``` `Almer` can then be used to estimate the means and variances of the process. ```{r} mod <- Almer(y ~ 1 + (1|species), data = dt, A = list(species = A)) summary(mod) ``` The `Almer` function is flexible (it is based on `lmer`), and can include additional fixed and random effects on top of the phylogenetic effects. Also, it is not restricted to phylogeny related problems, for example it can be used to estimate additive genetic variances and/or dominance variances (the argument `A` can have several entries). # The `Almer_SE` function This function extends `Almer` by allowing the inclusion of the uncertainty of the species means. To do this, we take advantage of how weights are included in the `lmer` function: the diagonal of the residual co-variance matrix is the residual variance parameter $\sigma^2$ times the vector of inverse weights. By using weights equal to $1/(1+SE^2/\sigma^2)$, where $SE$ is a vector of standard errors, the diagonal of the residual co-variance matrix is $\sigma^2 + SE^2$. Because the weights include the residual variance parameter, the function uses an iterative approach. To illustrate the approach, we add some arbitrary SE-values to the data ```{r} dt$SE <- runif(nrow(dt), min = 0.01, max = 0.02) ``` `Almer_SE` can then be used to estimate the means and variances of the process taking the uncertainty into account. ```{r} mod_SE <- Almer_SE(y ~ 1 + (1|species), data = dt, SE = dt$SE, A = list(species = A)) summary(mod_SE) ``` Note that the estimated residual variances represent the residual variance after correcting for the uncertainty in the means. Thus, this function can be useful for meta analyses. # Simulating data and bootstrapping The `Almer_sim` function can be used to simulate the responses, in our case species means, of the fitted models of both `Almer` and `Almer_SE`. Note that the `lme4::simulate.merMod` function did not seem to work properly when the number of random effects equal the number of observations, and could therefore not be used. ```{r} sim_y <- Almer_sim(mod, nsim = 3) sim_y[1:3,] ``` This can further be used to do bootstrapping, implemented in `Almer_boot`. ```{r} # The number of bootstrap simulations is kept very low in the interest # of computational speed. Often 1000 is used in real analyses. Almer_boot_obj <- Almer_boot(mod, nsim = 10) Almer_boot_obj$fixef Almer_boot_obj$vcov ``` # Phylogenetic heritability The `phylH` function can be used to estimate the phylogenetic heritability of a object fitted by `Almer`. The 95% confidence interval is estimated by parametric bootstrapping. Both the name of the numerator of the heritability and, unless the phylogenetic residual is estimated as residuals in the model fit, the name of the phylogenetic residuals need to be specified. ```{r} # The number of bootstrap simulations is kept very low in the interest # of computational speed. Often 1000 is used in real analyses. phylH_obj <- phylH(mod, numerator = "species", nsim = 10) phylH_obj$phylH ```