Almer
functionThe Almer
function is a modification/hack of the
lmer
function in the lme4
package to
incorporate correlated effects in the random structure. The
Almer
function can be used to fit phylogenetic mixed models
(and other models with correlated random effects such as “animal
models”).
To start, we need an ultrametric phylogeny of unit depth. We can
construct this, for example, using the function rtree
of
the ape
package.
# Only a very small sample size is used
# in the interest of computational speed:
set.seed(57)
n_species <- 50
tree <- ape::rtree(n = n_species)
tree <- ape::chronopl(tree, lambda = 1)
From this we can generate the phylogenetic relatedness matrix A.
The column names of A must be the species identifier.
From this we can simulate a Brownian motion process and add some residual noise.
y <- 5 + t(chol(A))%*%rnorm(n_species, 0, 2) + # BM process with mean = 5 and sd = 2
rnorm(n_species, 0, 1) # residual variation with sd = 1
For Almer
to work, the data must include the species
identifier in addition to the species means.
Almer
can then be used to estimate the means and
variances of the process.
mod <- Almer(y ~ 1 + (1|species), data = dt, A = list(species = A))
summary(mod)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y ~ 1 + (1 | species)
#> Data: dt
#>
#> REML criterion at convergence: 209.8
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -1.59586 -0.60714 -0.02786 0.47236 1.77123
#>
#> Random effects:
#> Groups Name Variance Std.Dev.
#> species (Intercept) 1.816 1.348
#> Residual 2.486 1.577
#> Number of obs: 50, groups: species, 50
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> (Intercept) 5.0356 0.4562 11.04
The Almer
function is flexible (it is based on
lmer
), and can include additional fixed and random effects
on top of the phylogenetic effects. Also, it is not restricted to
phylogeny related problems, for example it can be used to estimate
additive genetic variances and/or dominance variances (the argument
A
can have several entries).
Almer_SE
functionThis function extends Almer
by allowing the inclusion of
the uncertainty of the species means. To do this, we take advantage of
how weights are included in the lmer
function: the diagonal
of the residual co-variance matrix is the residual variance parameter
σ2 times the vector
of inverse weights. By using weights equal to 1/(1 + SE2/σ2),
where SE is a vector
of standard errors, the diagonal of the residual co-variance matrix is
σ2 + SE2.
Because the weights include the residual variance parameter, the
function uses an iterative approach.
To illustrate the approach, we add some arbitrary SE-values to the data
Almer_SE
can then be used to estimate the means and
variances of the process taking the uncertainty into account.
mod_SE <- Almer_SE(y ~ 1 + (1|species), data = dt, SE = dt$SE, A = list(species = A))
summary(mod_SE)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y ~ 1 + (1 | species)
#> Data: dt
#>
#> REML criterion at convergence: 209.8
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -1.59580 -0.60710 -0.02786 0.47233 1.77115
#>
#> Random effects:
#> Groups Name Variance Std.Dev.
#> species (Intercept) 1.816 1.348
#> Residual 2.485 1.576
#> Number of obs: 50, groups: species, 50
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> (Intercept) 5.0356 0.4562 11.04
Note that the estimated residual variances represent the residual variance after correcting for the uncertainty in the means. Thus, this function can be useful for meta analyses.
The Almer_sim
function can be used to simulate the
responses, in our case species means, of the fitted models of both
Almer
and Almer_SE
. Note that the
lme4::simulate.merMod
function did not seem to work
properly when the number of random effects equal the number of
observations, and could therefore not be used.
sim_y <- Almer_sim(mod, nsim = 3)
sim_y[1:3,]
#> 3 x 3 Matrix of class "dgeMatrix"
#> Sim_1 Sim_2 Sim_3
#> 1 5.195749 7.330109 3.470472
#> 2 1.903504 3.013703 6.442719
#> 3 6.299618 6.199950 9.447092
This can further be used to do bootstrapping, implemented in
Almer_boot
.
# The number of bootstrap simulations is kept very low in the interest
# of computational speed. Often 1000 is used in real analyses.
Almer_boot_obj <- Almer_boot(mod, nsim = 10)
Almer_boot_obj$fixef
#> Mean Std. Err. 2.5% 97.5%
#> (Intercept) 4.993907 0.4755234 4.476581 5.778056
Almer_boot_obj$vcov
#> Mean Std. Err. 2.5% 97.5%
#> species 2.090493 1.857612 0.000000 4.828229
#> Residual 2.487438 1.730642 0.101273 5.069712
The phylH
function can be used to estimate the
phylogenetic heritability of a object fitted by Almer
. The
95% confidence interval is estimated by parametric bootstrapping. Both
the name of the numerator of the heritability and, unless the
phylogenetic residual is estimated as residuals in the model fit, the
name of the phylogenetic residuals need to be specified.