| Title: | Parametric Bootstrap, Kenward-Roger and Satterthwaite Based Methods for Test in Mixed Models |
|---|---|
| Description: | Computes p-values based on (a) Satterthwaite or Kenward-Rogers degree of freedom methods and (b) parametric bootstrap for mixed effects models as implemented in the 'lme4' package. Implements parametric bootstrap test for generalized linear mixed models as implemented in 'lme4' and generalized linear models. The package is documented in the paper by Halekoh and Højsgaard, (2012, <doi:10.18637/jss.v059.i09>). Please see 'citation("pbkrtest")' for citation details. |
| Authors: | Ulrich Halekoh [aut, cph], Søren Højsgaard [aut, cre, cph] |
| Maintainer: | Søren Højsgaard <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.5.6 |
| Built: | 2026-05-31 07:30:35 UTC |
| Source: | https://github.com/hojsgaard/pbkrtest |
anova like function
print anovax object
anovax(object, ..., test = "x2", control = list(nsim = 1000, cl = NULL)) ## S3 method for class 'lmerMod' anovax(object, ..., test = "x2", control = list(nsim = 1000, cl = NULL)) ## Default S3 method: anovax(object, ..., test = "x2", control = list(nsim = 1000, cl = NULL)) ## S3 method for class 'anovax' print(x, ...)anovax(object, ..., test = "x2", control = list(nsim = 1000, cl = NULL)) ## S3 method for class 'lmerMod' anovax(object, ..., test = "x2", control = list(nsim = 1000, cl = NULL)) ## Default S3 method: anovax(object, ..., test = "x2", control = list(nsim = 1000, cl = NULL)) ## S3 method for class 'anovax' print(x, ...)
object |
A model object object |
... |
further arguments |
test |
A character string |
control |
A list controling simulations, only relevant for parametric bootstrapping. |
x |
anovax object |
Søren Højsgaard
lmm1 <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets) lmm0 <- update(lmm1, .~. - sow) anovax(lmm1, .~. - harvest, test="X2") anovax(lmm1, .~. - harvest, test="KR") anovax(lmm1, .~. - harvest, test="SAT") anovax(lmm1, .~. - harvest, test="PB", control=list(nsim=50, cl=1)) anovax(lmm1, test="X2") anovax(lmm1, test="KR") anovax(lmm1, test="SAT") anovax(lmm1, test="PB", control=list(nsim=50, cl=1))lmm1 <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets) lmm0 <- update(lmm1, .~. - sow) anovax(lmm1, .~. - harvest, test="X2") anovax(lmm1, .~. - harvest, test="KR") anovax(lmm1, .~. - harvest, test="SAT") anovax(lmm1, .~. - harvest, test="PB", control=list(nsim=50, cl=1)) anovax(lmm1, test="X2") anovax(lmm1, test="KR") anovax(lmm1, test="SAT") anovax(lmm1, test="PB", control=list(nsim=50, cl=1))
Various different tests for model comparison
anovax_list( object, object2, test = c("x2", "kr", "sat", "pb"), control = list(nsim = 1000) )anovax_list( object, object2, test = c("x2", "kr", "sat", "pb"), control = list(nsim = 1000) )
object |
Model object |
object2 |
Model object or equivalent way of specifying a submodel of lmm1 |
test |
A vector with the various test types. |
control |
A list controlling the model comparions. |
Dataframe with results of the various tests
Søren Højsgaard
Convert the test result table of a PBmodcomp or summary.PBmodcomp object
into a standard data frame for easy inspection or export.
## S3 method for class 'PBmodcomp' as.data.frame(x, ...) ## S3 method for class 'summary_PBmodcomp' as.data.frame(x, ...)## S3 method for class 'PBmodcomp' as.data.frame(x, ...) ## S3 method for class 'summary_PBmodcomp' as.data.frame(x, ...)
x |
An object of class |
... |
Further arguments (currently ignored). |
These methods extract the internal test component (a data frame of test statistics and p-values)
from the object and return it as a standard data frame. The original object's attributes are preserved
as attributes on the returned data frame.
This is useful for exporting or manipulating bootstrap test results in a tidy form.
A data.frame containing the test results with attributes preserved.
## Not run: sleepstudy$Days2 <- sleepstudy$Days^2 lmer_fit1 <- lme4::lmer(Reaction ~ Days + Days2 + (1 | Subject), data = sleepstudy, REML=FALSE) lmer_fit0 <- update(lmer_fit1, . ~ . - Days2) res <- pb2_modcomp(lmer_fit1, lmer_fit0, nsim = 200) df <- as.data.frame(res) head(df) s <- summary(res) as.data.frame(s) ## End(Not run) ## @seealso \code{\link{pb2_modcomp}}, \code{\link{summary.PBmodcomp}}## Not run: sleepstudy$Days2 <- sleepstudy$Days^2 lmer_fit1 <- lme4::lmer(Reaction ~ Days + Days2 + (1 | Subject), data = sleepstudy, REML=FALSE) lmer_fit0 <- update(lmer_fit1, . ~ . - Days2) res <- pb2_modcomp(lmer_fit1, lmer_fit0, nsim = 200) df <- as.data.frame(res) head(df) s <- summary(res) as.data.frame(s) ## End(Not run) ## @seealso \code{\link{pb2_modcomp}}, \code{\link{summary.PBmodcomp}}
Compare column spaces of two matrices
compare_column_space(X1, X2)compare_column_space(X1, X2)
X1, X2
|
matrices with the same number of rows |
-1 : Either C(X1)=C(X2), or the spaces are not nested.
0 : C(X1) is contained in C(X2)
1 : C(X2) is contained in C(X1)
A1 <- matrix(c(1,1,1,1,2,3), nrow=3) A2 <- A1[, 1, drop=FALSE] compare_column_space(A1, A2) compare_column_space(A2, A1) compare_column_space(A1, A1)A1 <- matrix(c(1,1,1,1,2,3), nrow=3) A2 <- A1[, 1, drop=FALSE] compare_column_space(A1, A2) compare_column_space(A2, A1) compare_column_space(A1, A1)
These methods compute the marginal covariance matrix
for fitted linear and linear mixed models of
the form
where and
are random effects and residual errors.
The returned matrix represents the implied covariance structure of the response vector, combining contributions from both random effects and residuals.
cov_matrix(fit, ...) ## S3 method for class 'lmerMod' cov_matrix(fit, ...) ## S3 method for class 'lme' cov_matrix(fit, ...) ## S3 method for class 'gls' cov_matrix(fit, ...) ## S3 method for class 'lm' cov_matrix(fit, ...)cov_matrix(fit, ...) ## S3 method for class 'lmerMod' cov_matrix(fit, ...) ## S3 method for class 'lme' cov_matrix(fit, ...) ## S3 method for class 'gls' cov_matrix(fit, ...) ## S3 method for class 'lm' cov_matrix(fit, ...)
fit |
A fitted model object. Supported classes: |
... |
Additional arguments (currently ignored). |
A sparse matrix (class "dgCMatrix") representing the
marginal covariance matrix .
cov_matrix.lmerModFor
lmerMod models from the lme4 package. Computes
.
cov_matrix.lmeFor lme models from the
nlme package. Computes block-diagonal covariance with
group-specific structures.
cov_matrix.glsFor
gls models from the nlme package. Includes
optional correlation structures.
if (require(nlme) && require(Matrix)) { ## gls example fit_gls <- gls(distance ~ age, data = Orthodont) V_gls <- cov_matrix(fit_gls) print(V_gls) ## lme example fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont) V_lme <- cov_matrix(fit_lme) print(V_lme) } if (require(lme4) && require(Matrix)) { ## lmerMod example fit_lmer <- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy) V_lmer <- cov_matrix(fit_lmer) print(V_lmer) }if (require(nlme) && require(Matrix)) { ## gls example fit_gls <- gls(distance ~ age, data = Orthodont) V_gls <- cov_matrix(fit_gls) print(V_gls) ## lme example fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont) V_lme <- cov_matrix(fit_lme) print(V_lme) } if (require(lme4) && require(Matrix)) { ## lmerMod example fit_lmer <- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy) V_lmer <- cov_matrix(fit_lmer) print(V_lmer) }
Extracts estimated variance components and correlation parameters that define the covariance structure in linear and mixed-effects models.
cov_par(fit, ...) ## S3 method for class 'lmerMod' cov_par(fit, ...) ## S3 method for class 'lme' cov_par(fit, ...) ## S3 method for class 'gls' cov_par(fit, ...) ## S3 method for class 'lm' cov_par(fit, ...)cov_par(fit, ...) ## S3 method for class 'lmerMod' cov_par(fit, ...) ## S3 method for class 'lme' cov_par(fit, ...) ## S3 method for class 'gls' cov_par(fit, ...) ## S3 method for class 'lm' cov_par(fit, ...)
fit |
A fitted model object. Supported classes: |
... |
Currently ignored. |
The function returns parameters describing the model-implied covariance matrix, such as residual variances, random effect variances, and correlation parameters (e.g. AR(1), compound symmetry).
A list or matrix describing the estimated variance and correlation parameters:
A VarCorr object summarizing random effects and residual variance.
A list with variance components (variances and standard deviations) and optional correlation structure parameters.
A list containing residual variance and any correlation or variance structure parameters.
Residual standard deviation.
cov_par.lmerModExtracts random effect and residual variances from lme4 models.
cov_par.lmeReturns variance components, residual variance, and any correlation structure parameters from nlme models.
cov_par.glsExtracts residual variance and optional correlation or heteroscedasticity parameters from nlme GLS models.
cov_par.lmReturns residual standard deviation from a linear model.
if (require(nlme) && require(lme4)) { data(Orthodont, package = "nlme") # GLS model fit_gls <- gls(distance ~ age, data = Orthodont) cov_par(fit_gls) # LME model fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont) cov_par(fit_lme) # lmerMod model fit_lmer <- lme4::lmer(distance ~ age + (1 | Subject), data = Orthodont) cov_par(fit_lmer) # lm model fit_lm <- lm(distance ~ age, data = Orthodont) cov_par(fit_lm) }if (require(nlme) && require(lme4)) { data(Orthodont, package = "nlme") # GLS model fit_gls <- gls(distance ~ age, data = Orthodont) cov_par(fit_gls) # LME model fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont) cov_par(fit_lme) # lmerMod model fit_lmer <- lme4::lmer(distance ~ age + (1 | Subject), data = Orthodont) cov_par(fit_lmer) # lm model fit_lm <- lm(distance ~ age, data = Orthodont) cov_par(fit_lm) }
Yield and sugar percentage in sugar beets from a split plot
experiment. The experimental layout was as follows: There were
three blocks. In each block, the harvest time defines the
"whole plot" and the sowing time defines the "split plot". Each
plot was and the yield is recorded in kg. See
'details' for the experimental layout. The data originates from
a study carried out at The Danish Institute for Agricultural
Sciences (the institute does not exist any longer; it became
integrated in a Danish university).
beetsbeets
A dataframe with 5 columns and 30 rows.
Experimental plan
Sowing times 1 4. april
2 12. april
3 21. april
4 29. april
5 18. may
Harvest times 1 2. october
2 21. october
Plot allocation:
Block 1 Block 2 Block 3
+-----------|-----------|-----------+
Plot | 1 1 1 1 1 | 2 2 2 2 2 | 1 1 1 1 1 | Harvest time
1-15 | 3 4 5 2 1 | 3 2 4 5 1 | 5 2 3 4 1 | Sowing time
|-----------|-----------|-----------|
Plot | 2 2 2 2 2 | 1 1 1 1 1 | 2 2 2 2 2 | Harvest time
16-30 | 2 1 5 4 3 | 4 1 3 2 5 | 1 4 3 2 5 | Sowing time
+-----------|-----------|-----------+
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
data(beets) beets$bh <- with(beets, interaction(block, harvest)) summary(aov(yield ~ block + sow + harvest + Error(bh), beets)) summary(aov(sugpct ~ block + sow + harvest + Error(bh), beets))data(beets) beets$bh <- with(beets, interaction(block, harvest)) summary(aov(yield ~ block + sow + harvest + Error(bh), beets)) summary(aov(sugpct ~ block + sow + harvest + Error(bh), beets))
Experiment on the toxicity to the tobacco budworm Heliothis virescens of doses of the pyrethroid trans-cypermethrin to which the moths were beginning to show resistance. Batches of 20 moths of each sex were exposed for three days to the pyrethroid and the number in each batch that were dead or knocked down was recorded. Data is reported in Collett (1991, p. 75).
budwormbudworm
This data frame contains 12 rows and 4 columns:
sex of the budworm.
dose of the insecticide trans-cypermethrin (in micro grams)
.
budworms killed in a trial.
total number of budworms exposed per trial.
Collett, D. (1991) Modelling Binary Data, Chapman & Hall, London, Example 3.7
Venables, W.N; Ripley, B.D.(1999) Modern Applied Statistics with S-Plus, Heidelberg, Springer, 3rd edition, chapter 7.2
data(budworm) ## function to caclulate the empirical logits empirical.logit<- function(nevent,ntotal) { y <- log((nevent + 0.5) / (ntotal - nevent + 0.5)) y } # plot the empirical logits against log-dose log.dose <- log(budworm$dose) emp.logit <- empirical.logit(budworm$ndead, budworm$ntotal) plot(log.dose, emp.logit, type='n', xlab='log-dose',ylab='emprirical logit') title('budworm: emprirical logits of probability to die ') male <- budworm$sex=='male' female <- budworm$sex=='female' lines(log.dose[male], emp.logit[male], type='b', lty=1, col=1) lines(log.dose[female], emp.logit[female], type='b', lty=2, col=2) legend(0.5, 2, legend=c('male', 'female'), lty=c(1,2), col=c(1,2)) ## Not run: * SAS example; data budworm; infile 'budworm.txt' firstobs=2; input sex dose ndead ntotal; run; ## End(Not run)data(budworm) ## function to caclulate the empirical logits empirical.logit<- function(nevent,ntotal) { y <- log((nevent + 0.5) / (ntotal - nevent + 0.5)) y } # plot the empirical logits against log-dose log.dose <- log(budworm$dose) emp.logit <- empirical.logit(budworm$ndead, budworm$ntotal) plot(log.dose, emp.logit, type='n', xlab='log-dose',ylab='emprirical logit') title('budworm: emprirical logits of probability to die ') male <- budworm$sex=='male' female <- budworm$sex=='female' lines(log.dose[male], emp.logit[male], type='b', lty=1, col=1) lines(log.dose[female], emp.logit[female], type='b', lty=2, col=2) legend(0.5, 2, legend=c('male', 'female'), lty=c(1,2), col=c(1,2)) ## Not run: * SAS example; data budworm; infile 'budworm.txt' firstobs=2; input sex dose ndead ntotal; run; ## End(Not run)
Return fitted values
fitted0(object, ...)fitted0(object, ...)
object |
A fitted model |
... |
Not used |
Get adjusted denominator degrees freedom for testing Lb=0 in a linear mixed model where L is a restriction matrix.
get_Lb_ddf(object, L) ## S3 method for class 'lmerMod' get_Lb_ddf(object, L) Lb_ddf(L, V0, Vadj) get_ddf_Lb(object, Lcoef) ## S3 method for class 'lmerMod' get_ddf_Lb(object, Lcoef) ddf_Lb(VVa, Lcoef, VV0 = VVa)get_Lb_ddf(object, L) ## S3 method for class 'lmerMod' get_Lb_ddf(object, L) Lb_ddf(L, V0, Vadj) get_ddf_Lb(object, Lcoef) ## S3 method for class 'lmerMod' get_ddf_Lb(object, Lcoef) ddf_Lb(VVa, Lcoef, VV0 = VVa)
object |
A linear mixed model object. |
L |
A vector with the same length as |
V0, Vadj
|
The unadjusted and the adjusted covariance matrices for the fixed
effects parameters. The unadjusted covariance matrix is obtained with
|
Lcoef |
Linear contrast matrix |
VVa |
Adjusted covariance matrix |
VV0 |
Unadjusted covariance matrix |
Adjusted degrees of freedom (adjustment made by a Kenward-Roger approximation).
Søren Højsgaard, [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
KRmodcomp, vcovAdj,
model2restriction_matrix,
restriction_matrix2model
(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm0 <- lmer(Reaction ~ 1 + (Days|Subject), sleepstudy)) anova(fm1, fm0) KRmodcomp(fm1, fm0) ## 17 denominator df's get_Lb_ddf(fm1, c(0, 1)) ## 17 denominator df's # Notice: The restriction matrix L corresponding to the test above # can be found with L <- model2restriction_matrix(fm1, fm0) L(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm0 <- lmer(Reaction ~ 1 + (Days|Subject), sleepstudy)) anova(fm1, fm0) KRmodcomp(fm1, fm0) ## 17 denominator df's get_Lb_ddf(fm1, c(0, 1)) ## 17 denominator df's # Notice: The restriction matrix L corresponding to the test above # can be found with L <- model2restriction_matrix(fm1, fm0) L
KRmodcomp or
SATmodcomp object.Extract (or "get") components from a KRmodcomp
or SATmodcomp object. In particular, get denominator
degrees of freedom.
getKR( object, name = c("ndf", "ddf", "Fstat", "p.value", "F.scaling", "FstatU", "p.valueU", "aux") ) getSAT(object, name = c("ndf", "ddf", "Fstat", "p.value"))getKR( object, name = c("ndf", "ddf", "Fstat", "p.value", "F.scaling", "FstatU", "p.valueU", "aux") ) getSAT(object, name = c("ndf", "ddf", "Fstat", "p.value"))
object |
A |
name |
The available slots. If |
Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) x10 <- KRmodcomp(fm1, fm0) getKR(x10, "ddf") KRmodcomp(fm1, fm0) |> getKR("ddf") KRmodcomp(fm2, fm0) |> getKR("ddf") KRmodcomp(fm2, fm1) |> getKR("ddf") ## For comparison: SATmodcomp(fm1, fm0) |> getSAT("ddf") SATmodcomp(fm2, fm0) |> getSAT("ddf") SATmodcomp(fm2, fm1) |> getSAT("ddf")(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) x10 <- KRmodcomp(fm1, fm0) getKR(x10, "ddf") KRmodcomp(fm1, fm0) |> getKR("ddf") KRmodcomp(fm2, fm0) |> getKR("ddf") KRmodcomp(fm2, fm1) |> getKR("ddf") ## For comparison: SATmodcomp(fm1, fm0) |> getSAT("ddf") SATmodcomp(fm2, fm0) |> getSAT("ddf") SATmodcomp(fm2, fm1) |> getSAT("ddf")
Constructs or extracts a nested model (fit0) from a full model (fit1)
using flexible input: a model object, formula, character string, or matrix.
This function is useful for preparing models for comparison, e.g., via likelihood ratio test.
get_nested_model_info(fit1, fit0)get_nested_model_info(fit1, fit0)
fit1 |
A fitted model object (e.g., from |
fit0 |
A nested model specification: a model object, a formula (e.g., |
A list with:
Formula for fit1.
Formula for resolved fit0.
The full model fit1.
The nested model fit0.
Restriction matrix defining the nested model.
if (requireNamespace("lme4", quietly = TRUE)) { library(lme4) data(sleepstudy) fit1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) fit0 <- lmer(Reaction ~ (Days | Subject), sleepstudy) get_nested_model_info(fit1, fit0) # as model object get_nested_model_info(fit1, ~ . - Days) # as formula get_nested_model_info(fit1, "Days") # as string ## get_nested_model_info(fit1, c(0, 1)) # numeric (converted to matrix) }if (requireNamespace("lme4", quietly = TRUE)) { library(lme4) data(sleepstudy) fit1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) fit0 <- lmer(Reaction ~ (Days | Subject), sleepstudy) get_nested_model_info(fit1, fit0) # as model object get_nested_model_info(fit1, ~ . - Days) # as formula get_nested_model_info(fit1, "Days") # as string ## get_nested_model_info(fit1, c(0, 1)) # numeric (converted to matrix) }
Performs a likelihood ratio test (LRT) between two nested models. Supports
models of class lm, lmerMod, glmerMod, lme, and gls.
getLRT(fit1, fit0)getLRT(fit1, fit0)
fit1 |
A model object representing the more complex (full) model. |
fit0 |
A model object representing the simpler (nested) model. |
A named numeric vector with:
Test statistic (twice the difference in log-likelihoods).
Degrees of freedom (difference in number of parameters).
P-value from the chi-squared distribution.
## lm fit1 <- lm(mpg ~ wt + hp, data = mtcars) fit0 <- lm(mpg ~ wt, data = mtcars) getLRT(fit1, fit0) ## lmerMod if (requireNamespace("lme4", quietly = TRUE)) { library(lme4) fit1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE) fit0 <- lmer(Reaction ~ 1 + (Days | Subject), sleepstudy, REML = FALSE) getLRT(fit1, fit0) } ## glmerMod if (requireNamespace("lme4", quietly = TRUE)) { library(lme4) data(cbpp) fit1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial) fit0 <- glmer(cbind(incidence, size - incidence) ~ 1 + (1 | herd), data = cbpp, family = binomial) getLRT(fit1, fit0) } ## lme if (requireNamespace("nlme", quietly = TRUE)) { library(nlme) fit1 <- lme(distance ~ age + Sex, random = ~1 | Subject, data = Orthodont, method = "ML") fit0 <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont, method = "ML") getLRT(fit1, fit0) } ## gls if (requireNamespace("nlme", quietly = TRUE)) { library(nlme) fit1 <- gls(mpg ~ wt + hp, data = mtcars, method = "ML") fit0 <- gls(mpg ~ wt, data = mtcars, method = "ML") getLRT(fit1, fit0) }## lm fit1 <- lm(mpg ~ wt + hp, data = mtcars) fit0 <- lm(mpg ~ wt, data = mtcars) getLRT(fit1, fit0) ## lmerMod if (requireNamespace("lme4", quietly = TRUE)) { library(lme4) fit1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE) fit0 <- lmer(Reaction ~ 1 + (Days | Subject), sleepstudy, REML = FALSE) getLRT(fit1, fit0) } ## glmerMod if (requireNamespace("lme4", quietly = TRUE)) { library(lme4) data(cbpp) fit1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial) fit0 <- glmer(cbind(incidence, size - incidence) ~ 1 + (1 | herd), data = cbpp, family = binomial) getLRT(fit1, fit0) } ## lme if (requireNamespace("nlme", quietly = TRUE)) { library(nlme) fit1 <- lme(distance ~ age + Sex, random = ~1 | Subject, data = Orthodont, method = "ML") fit0 <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont, method = "ML") getLRT(fit1, fit0) } ## gls if (requireNamespace("nlme", quietly = TRUE)) { library(nlme) fit1 <- gls(mpg ~ wt + hp, data = mtcars, method = "ML") fit0 <- gls(mpg ~ wt, data = mtcars, method = "ML") getLRT(fit1, fit0) }
Glance method for nlme::gls objects
glance.gls(x, ...)glance.gls(x, ...)
x |
A fitted model of class "gls" (from nlme). |
... |
Unused (for S3 compatibility). |
one-row data.frame with overall fit statistics.
Søren Højsgaard, [email protected]
An approximate F-test based on the Kenward-Roger approach.
KRmodcomp(largeModel, smallModel, betaH = 0, details = 0) ## S3 method for class 'lmerMod' KRmodcomp(largeModel, smallModel, betaH = 0, details = 0)KRmodcomp(largeModel, smallModel, betaH = 0, details = 0) ## S3 method for class 'lmerMod' KRmodcomp(largeModel, smallModel, betaH = 0, details = 0)
largeModel |
An |
smallModel |
An |
betaH |
A number or a vector of the beta of the hypothesis,
e.g. L beta=L betaH. If |
details |
If larger than 0 some timing details are printed. |
An F test is calculated according to the approach of Kenward and
Roger (1997). The function works for linear mixed models fitted
with the lmer() function of the lme4 package. Only models where
the covariance structure is a linear combination (a weighted sum)
of known matrices can be compared.
The smallModel is the model to be tested against the largeModel.
The largeModel is a model fitted with lmer(). A technical
detail: The model must be fitted with REML=TRUE. If the model is
fitted with REML=FALSE then the model is refitted with
REML=TRUE before the p-values are calculated. Put differently,
the user needs not worry about this issue.
The smallModel can be one of several things:
a model fitted with lmer(). It must have the same covariance
structure as largeModel. Furthermore, its linear space of
expectation must be a subspace of the space for largeModel.
a restriction matrix L specifying the hypothesis
where L is a k x p matrix (there are k restrictions and p is
the number of fixed effect parameters (the length of
fixef(largeModel)) and beta_H is a p column vector.
A formula or a text string specifying what is to be removed from the larger model to form the smaller model.
Notice: if you want to test a hypothesis
with a vector , a suitable is obtained
via where is a g-inverse of .
Notice: It cannot be guaranteed that the results agree with other implementations of the Kenward-Roger approach!
Ulrich Halekoh [email protected], Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
Kenward, M. G. and Roger, J. H. (1997), Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood, Biometrics 53: 983-997.
getKR, lmer,
vcovAdj, PBmodcomp,
SATmodcomp
(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) ## Test for no effect of Days in fm1, i.e. test fm0 under fm1 KRmodcomp(fm1, "Days") KRmodcomp(fm1, ~.-Days) L1 <- cbind(0, 1) KRmodcomp(fm1, L1) KRmodcomp(fm1, fm0) anova(fm1, fm0) ## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2 KRmodcomp(fm2, "(Days+I(Days^2))") KRmodcomp(fm2, ~. - Days - I(Days^2)) L2 <- rbind(c(0, 1, 0), c(0, 0, 1)) KRmodcomp(fm2, L2) KRmodcomp(fm2, fm0) anova(fm2, fm0) ## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2 KRmodcomp(fm2, "I(Days^2)") KRmodcomp(fm2, ~. - I(Days^2)) L3 <- rbind(c(0, 0, 1)) KRmodcomp(fm2, L3) KRmodcomp(fm2, fm1) anova(fm2, fm1)(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) ## Test for no effect of Days in fm1, i.e. test fm0 under fm1 KRmodcomp(fm1, "Days") KRmodcomp(fm1, ~.-Days) L1 <- cbind(0, 1) KRmodcomp(fm1, L1) KRmodcomp(fm1, fm0) anova(fm1, fm0) ## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2 KRmodcomp(fm2, "(Days+I(Days^2))") KRmodcomp(fm2, ~. - Days - I(Days^2)) L2 <- rbind(c(0, 1, 0), c(0, 0, 1)) KRmodcomp(fm2, L2) KRmodcomp(fm2, fm0) anova(fm2, fm0) ## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2 KRmodcomp(fm2, "I(Days^2)") KRmodcomp(fm2, ~. - I(Days^2)) L3 <- rbind(c(0, 0, 1)) KRmodcomp(fm2, L3) KRmodcomp(fm2, fm1) anova(fm2, fm1)
Kenward and Roger (1997) describe an improved small sample approximation to the covariance matrix estimate of the fixed parameters in a linear mixed model.
vcovAdj(object, details = 0) ## S3 method for class 'lmerMod' vcovAdj(object, details = 0)vcovAdj(object, details = 0) ## S3 method for class 'lmerMod' vcovAdj(object, details = 0)
object |
An |
details |
If larger than 0 some timing details are printed. |
phiA |
the estimated covariance matrix, this has attributed P, a
list of matrices used in |
SigmaG |
list: Sigma: the covariance matrix of Y; G: the G matrices that
sum up to Sigma; |
If $N$ is the number of observations, then the vcovAdj()
function involves inversion of an $N x N$ matrix, so the computations can
be relatively slow.
Ulrich Halekoh [email protected], Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
Kenward, M. G. and Roger, J. H. (1997), Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood, Biometrics 53: 983-997.
getKR, KRmodcomp, lmer,
PBmodcomp, vcovAdj
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy, REML=TRUE) class(fm1) set.seed(123) sleepstudy2 <- sleepstudy[sample(nrow(sleepstudy), size=120), ] fm2 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy2, REML=TRUE) ## Here the adjusted and unadjusted covariance matrices are identical, ## but that is not generally the case: v1 <- vcov(fm1) v1a <- vcovAdj(fm1, details=0) v1a / v1 v2 <- vcov(fm2) v2a <- vcovAdj(fm2, details=0) v2a / v2 # For comparison, an alternative estimate of the # variance-covariance matrix is based on parametric bootstrap (and # this is easily parallelized):fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy, REML=TRUE) class(fm1) set.seed(123) sleepstudy2 <- sleepstudy[sample(nrow(sleepstudy), size=120), ] fm2 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy2, REML=TRUE) ## Here the adjusted and unadjusted covariance matrices are identical, ## but that is not generally the case: v1 <- vcov(fm1) v1a <- vcovAdj(fm1, details=0) v1a / v1 v2 <- vcov(fm2) v2a <- vcovAdj(fm2, details=0) v2a / v2 # For comparison, an alternative estimate of the # variance-covariance matrix is based on parametric bootstrap (and # this is easily parallelized):
Testing a small model under a large model corresponds imposing restrictions on the model matrix of the larger model and these restrictions come in the form of a restriction matrix. These functions converts a model to a restriction matrix and vice versa.
model2restriction_matrix(largeModel, smallModel, sparse = FALSE) restriction_matrix2model(largeModel, L, REML = TRUE, ...) make_model_matrix(X, L) make_restriction_matrix(X, X2)model2restriction_matrix(largeModel, smallModel, sparse = FALSE) restriction_matrix2model(largeModel, L, REML = TRUE, ...) make_model_matrix(X, L) make_restriction_matrix(X, X2)
largeModel, smallModel
|
Model objects of the same "type". Possible types are linear mixed effects models and linear models (including generalized linear models) |
sparse |
Should the restriction matrix be sparse or dense? |
L |
A restriction matrix; a full rank matrix with as many columns as |
REML |
Controls if new model object should be fitted with REML or ML. |
... |
Additional arguments; not used. |
X, X2
|
Model matrices. Must have same number of rows. |
make_restriction_matrix Make a restriction matrix. If span(X2) is in
span(X) then the corresponding restriction matrix L is
returned.
model2restriction_matrix: A restriction matrix.
restriction_matrix2model: A model object.
That these functions are visible is a recent addition; minor changes may occur.
Ulrich Halekoh [email protected], Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
PBmodcomp, PBrefdist,
KRmodcomp
library(pbkrtest) data("beets", package = "pbkrtest") sug <- lm(sugpct ~ block + sow + harvest, data=beets) sug.h <- update(sug, .~. - harvest) sug.s <- update(sug, .~. - sow) ## Construct restriction matrices from models L.h <- model2restriction_matrix(sug, sug.h); L.h L.s <- model2restriction_matrix(sug, sug.s); L.s ## Construct submodels from restriction matrices mod.h <- restriction_matrix2model(sug, L.h); mod.h mod.s <- restriction_matrix2model(sug, L.s); mod.s ## Sanity check: The models have the same fitted values and log likelihood max(abs(fitted(mod.h) - fitted(sug.h))) max(abs(fitted(mod.s) - fitted(sug.s))) logLik(mod.h) logLik(sug.h) logLik(mod.s) logLik(sug.s)library(pbkrtest) data("beets", package = "pbkrtest") sug <- lm(sugpct ~ block + sow + harvest, data=beets) sug.h <- update(sug, .~. - harvest) sug.s <- update(sug, .~. - sow) ## Construct restriction matrices from models L.h <- model2restriction_matrix(sug, sug.h); L.h L.s <- model2restriction_matrix(sug, sug.s); L.s ## Construct submodels from restriction matrices mod.h <- restriction_matrix2model(sug, L.h); mod.h mod.s <- restriction_matrix2model(sug, L.s); mod.s ## Sanity check: The models have the same fitted values and log likelihood max(abs(fitted(mod.h) - fitted(sug.h))) max(abs(fitted(mod.s) - fitted(sug.s))) logLik(mod.h) logLik(sug.h) logLik(mod.s) logLik(sug.s)
Model comparison of nested models using parametric bootstrap methods. Implemented for some commonly applied model types.
PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'merMod' PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'lm' PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'gls' PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) seqPBmodcomp(largeModel, smallModel, h = 20, nsim = 1000, cl = 1)PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'merMod' PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'lm' PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'gls' PBmodcomp( largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL, cl = NULL, details = 0 ) seqPBmodcomp(largeModel, smallModel, h = 20, nsim = 1000, cl = 1)
largeModel, smallModel
|
Two models |
nsim |
The number of simulations to form the reference distribution. |
ref |
Vector containing samples from the reference
distribution. If NULL, this vector will be generated using
|
seed |
A seed that will be passed to the simulation of new datasets. |
cl |
A vector identifying a cluster; used for calculating the reference distribution using several cores. See examples below. |
details |
The amount of output produced. Mainly relevant for debugging purposes. |
h |
For sequential computing for bootstrap p-values: The number of extreme cases needed to generate before the sampling process stops. |
The model object must be fitted with maximum likelihood
(i.e. with REML=FALSE). If the object is fitted with
restricted maximum likelihood (i.e. with REML=TRUE) then
the model is refitted with REML=FALSE before the
p-values are calculated. Put differently, the user needs not
worry about this issue.
Under the fitted hypothesis (i.e. under the fitted small model) nsim
samples of the likelihood ratio test statistic (LRT) are generated.
Then p-values are calculated as follows:
LRT: Assuming that LRT has a chi-square distribution.
PBtest: The fraction of simulated LRT-values that are larger or equal to the observed LRT value.
Bartlett: A Bartlett correction is of LRT is calculated from the mean of the simulated LRT-values
Gamma: The reference distribution of LRT is assumed to be a gamma distribution with mean and variance determined as the sample mean and sample variance of the simulated LRT-values.
F: The LRT divided by the number of degrees of freedom is assumed to be F-distributed, where the denominator degrees of freedom are determined by matching the first moment of the reference distribution.
It can happen that some values of the LRT statistic in the reference distribution are negative. When this happens one will see that the number of used samples (those where the LRT is positive) are reported (this number is smaller than the requested number of samples).
In theory one can not have a negative value of the LRT statistic but in practice on can: We speculate that the reason is as follows: We simulate data under the small model and fit both the small and the large model to the simulated data. Therefore the large model represents - by definition - an over fit; the model has superfluous parameters in it. Therefore the fit of the two models will for some simulated datasets be very similar resulting in similar values of the log-likelihood. There is no guarantee that the the log-likelihood for the large model in practice always will be larger than for the small (convergence problems and other numerical issues can play a role here).
To look further into the problem, one can use the PBrefdist() function
for simulating the reference distribution (this reference distribution can be
provided as input to PBmodcomp()). Inspection sometimes reveals that
while many values are negative, they are numerically very small. In this case
one may try to replace the negative values by a small positive value and then
invoke PBmodcomp() to get some idea about how strong influence there
is on the resulting p-values. (The p-values get smaller this way compared to
the case when only the originally positive values are used).
Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
## Not run: (fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) NSIM <- 50 ## Simulations in parametric bootstrap; default is 1000. ## Test for no effect of Days in fm1, i.e. test fm0 under fm1 PBmodcomp(fm1, "Days", cl=1, nsim=NSIM) PBmodcomp(fm1, ~.-Days, cl=1, nsim=NSIM) L1 <- cbind(0, 1) PBmodcomp(fm1, L1, cl=1, nsim=NSIM) PBmodcomp(fm1, fm0, cl=1, nsim=NSIM) anova(fm1, fm0) ## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2 PBmodcomp(fm2, "(Days+I(Days^2))", cl=1, nsim=NSIM) PBmodcomp(fm2, ~. - Days - I(Days^2), cl=1, nsim=NSIM) L2 <- rbind(c(0, 1, 0), c(0, 0, 1)) PBmodcomp(fm2, L2, cl=1, nsim=NSIM) ## FIXME PBmodcomp(fm2, fm0, cl=1, nsim=NSIM) anova(fm2, fm0) ## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2 PBmodcomp(fm2, "I(Days^2)", cl=1, nsim=NSIM) PBmodcomp(fm2, ~. - I(Days^2), cl=1, nsim=NSIM) L3 <- rbind(c(0, 0, 1)) PBmodcomp(fm2, L3, cl=1, nsim=NSIM) PBmodcomp(fm2, fm1, cl=1, nsim=NSIM) anova(fm2, fm1) ## Linear normal model: sug <- lm(sugpct ~ block + sow + harvest, data=beets) sug.h <- update(sug, .~. -harvest) sug.s <- update(sug, .~. -sow) PBmodcomp(sug, "harvest", nsim=NSIM, cl=1) PBmodcomp(sug, ~. - harvest, nsim=NSIM, cl=1) PBmodcomp(sug, sug.h, nsim=NSIM, cl=1) anova(sug, sug.h) ## Generalized linear model mm <- glm(ndead/ntotal ~ sex + log(dose), family=binomial, weight=ntotal, data=budworm) mm0 <- update(mm, .~. -sex) ### Test for no effect of sex PBmodcomp(mm, "sex", cl=1, nsim=NSIM) PBmodcomp(mm, ~.-sex, cl=1, nsim=NSIM) ## PBmodcomp(mm, cbind(0, 1, 0), nsim=NSIM): FIXME PBmodcomp(mm, mm0, cl=1, nsim=NSIM) anova(mm, mm0, test="Chisq") ## End(Not run) ## Generalized linear mixed model (it takes a while to fit these) ## Not run: (gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial)) (gm2 <- update(gm1, .~.-period)) PBmodcomp(gm1, "period", nsim=NSIM) PBmodcomp(gm1, ~. -period, nsim=NSIM) PBmodcomp(gm1, gm2, nsim=NSIM) anova(gm1, gm2) ## End(Not run) ## Not run: ## Linear mixed effects model: sug <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets, REML=FALSE) sug.h <- update(sug, .~. -harvest) sug.s <- update(sug, .~. -sow) anova(sug, sug.h) PBmodcomp(sug, sug.h, nsim=NSIM, cl=1) PBmodcomp(sug, "harvest", nsim=NSIM, cl=1) anova(sug, sug.s) PBmodcomp(sug, sug.s, nsim=NSIM, cl=1) PBmodcomp(sug, "sow", nsim=NSIM, cl=1) ## Simulate reference distribution separately: refdist <- PBrefdist(sug, sug.h, nsim=1000, cl=1) refdist <- PBrefdist(sug, "harvest", nsim=1000, cl=1) refdist <- PBrefdist(sug, ~.-harvest, nsim=1000, cl=1) ## Do computations with multiple processors: ## Number of cores: (nc <- detectCores()) ## Create clusters cl <- makeCluster(rep("localhost", nc)) ## Then do: refdist <- PBrefdist(sug, sug.h, nsim=1000, cl=cl) ## It is recommended to stop the clusters before quitting R: stopCluster(cl) ## End(Not run) lm1 <- lm(dist~speed+I(speed^2), data=cars) PBmodcomp(lm1, .~.-speed, cl=2) PBmodcomp(lm1, .~.-I(speed^2), cl=2)## Not run: (fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) NSIM <- 50 ## Simulations in parametric bootstrap; default is 1000. ## Test for no effect of Days in fm1, i.e. test fm0 under fm1 PBmodcomp(fm1, "Days", cl=1, nsim=NSIM) PBmodcomp(fm1, ~.-Days, cl=1, nsim=NSIM) L1 <- cbind(0, 1) PBmodcomp(fm1, L1, cl=1, nsim=NSIM) PBmodcomp(fm1, fm0, cl=1, nsim=NSIM) anova(fm1, fm0) ## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2 PBmodcomp(fm2, "(Days+I(Days^2))", cl=1, nsim=NSIM) PBmodcomp(fm2, ~. - Days - I(Days^2), cl=1, nsim=NSIM) L2 <- rbind(c(0, 1, 0), c(0, 0, 1)) PBmodcomp(fm2, L2, cl=1, nsim=NSIM) ## FIXME PBmodcomp(fm2, fm0, cl=1, nsim=NSIM) anova(fm2, fm0) ## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2 PBmodcomp(fm2, "I(Days^2)", cl=1, nsim=NSIM) PBmodcomp(fm2, ~. - I(Days^2), cl=1, nsim=NSIM) L3 <- rbind(c(0, 0, 1)) PBmodcomp(fm2, L3, cl=1, nsim=NSIM) PBmodcomp(fm2, fm1, cl=1, nsim=NSIM) anova(fm2, fm1) ## Linear normal model: sug <- lm(sugpct ~ block + sow + harvest, data=beets) sug.h <- update(sug, .~. -harvest) sug.s <- update(sug, .~. -sow) PBmodcomp(sug, "harvest", nsim=NSIM, cl=1) PBmodcomp(sug, ~. - harvest, nsim=NSIM, cl=1) PBmodcomp(sug, sug.h, nsim=NSIM, cl=1) anova(sug, sug.h) ## Generalized linear model mm <- glm(ndead/ntotal ~ sex + log(dose), family=binomial, weight=ntotal, data=budworm) mm0 <- update(mm, .~. -sex) ### Test for no effect of sex PBmodcomp(mm, "sex", cl=1, nsim=NSIM) PBmodcomp(mm, ~.-sex, cl=1, nsim=NSIM) ## PBmodcomp(mm, cbind(0, 1, 0), nsim=NSIM): FIXME PBmodcomp(mm, mm0, cl=1, nsim=NSIM) anova(mm, mm0, test="Chisq") ## End(Not run) ## Generalized linear mixed model (it takes a while to fit these) ## Not run: (gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial)) (gm2 <- update(gm1, .~.-period)) PBmodcomp(gm1, "period", nsim=NSIM) PBmodcomp(gm1, ~. -period, nsim=NSIM) PBmodcomp(gm1, gm2, nsim=NSIM) anova(gm1, gm2) ## End(Not run) ## Not run: ## Linear mixed effects model: sug <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets, REML=FALSE) sug.h <- update(sug, .~. -harvest) sug.s <- update(sug, .~. -sow) anova(sug, sug.h) PBmodcomp(sug, sug.h, nsim=NSIM, cl=1) PBmodcomp(sug, "harvest", nsim=NSIM, cl=1) anova(sug, sug.s) PBmodcomp(sug, sug.s, nsim=NSIM, cl=1) PBmodcomp(sug, "sow", nsim=NSIM, cl=1) ## Simulate reference distribution separately: refdist <- PBrefdist(sug, sug.h, nsim=1000, cl=1) refdist <- PBrefdist(sug, "harvest", nsim=1000, cl=1) refdist <- PBrefdist(sug, ~.-harvest, nsim=1000, cl=1) ## Do computations with multiple processors: ## Number of cores: (nc <- detectCores()) ## Create clusters cl <- makeCluster(rep("localhost", nc)) ## Then do: refdist <- PBrefdist(sug, sug.h, nsim=1000, cl=cl) ## It is recommended to stop the clusters before quitting R: stopCluster(cl) ## End(Not run) lm1 <- lm(dist~speed+I(speed^2), data=cars) PBmodcomp(lm1, .~.-speed, cl=2) PBmodcomp(lm1, .~.-I(speed^2), cl=2)
pb_refdist() simulates a fixed number of bootstrap replicates.
pb_refdist_sequential() runs batches of simulations until a target number of extreme
hits is reached (or a maximum number of simulations is hit), useful for efficient estimation
of small p-values.
Both functions return objects of class "PBrefdist" with methods for summary(), plot(), and
as.data.frame().
pb_refdist( fit1, fit0, nsim = 1000, engine = c("serial", "parallel", "future"), nworkers = 2, verbose = FALSE ) pb_refdist_sequential( fit1, fit0, h = 20, nsim = 1000, batch_size = 50, engine = c("serial", "parallel", "future"), nworkers = 2, verbose = FALSE )pb_refdist( fit1, fit0, nsim = 1000, engine = c("serial", "parallel", "future"), nworkers = 2, verbose = FALSE ) pb_refdist_sequential( fit1, fit0, h = 20, nsim = 1000, batch_size = 50, engine = c("serial", "parallel", "future"), nworkers = 2, verbose = FALSE )
fit1 |
The larger (alternative) model, e.g. |
fit0 |
The smaller (null) model, nested in |
nsim |
Number of simulations (for |
engine |
Computation engine: |
nworkers |
Number of workers for parallel/future backend. |
verbose |
Logical; if |
h |
Number of extreme hits to target (for |
batch_size |
Number of simulations per batch |
Compute a parametric bootstrap reference distribution for a likelihood ratio statistic, comparing a large (alternative) and a small (null) model. The distribution can be used to estimate a bootstrap p-value.
The sequential version is useful when one wants to control Monte Carlo error by targeting a fixed number of extreme values exceeding the observed test statistic.
An object of class "PBrefdist" containing the observed statistic, the bootstrap replicates,
degrees of freedom, asymptotic p-value, and optionally the number of hits and standard error.
Best Practice: Always fit your models with the data= argument.
This ensures all covariates used in the model formula are stored with the model object,
enabling reliable simulation and refitting for bootstrap analysis,
including on parallel workers. Without data=, refitting may fail in parallel contexts
and reproducibility is compromised.
The returned object can be passed to summary(), plot(), and as.data.frame().
Best Practice: Always fit your models with the data= argument.
This ensures all covariates used in the model formula are stored with the model object,
enabling reliable simulation and refitting for bootstrap analysis,
including on parallel workers. Without data=, refitting may fail in parallel contexts
and reproducibility is compromised.
The function automatically ensures that the models have their required data embedded. This guarantees that parametric bootstrap simulations can be run in parallel workers without errors about missing variables, even if the original dataset was modified or removed from the global environment after fitting.
Calculate reference distribution of likelihood ratio statistic in mixed effects models using parametric bootstrap
PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'lm' PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'merMod' PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'gls' PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 )PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'lm' PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'merMod' PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 ) ## S3 method for class 'gls' PBrefdist( largeModel, smallModel, nsim = 1000, seed = NULL, cl = NULL, details = 0 )
largeModel |
A linear mixed effects model as fitted with the
|
smallModel |
A linear mixed effects model as fitted with the
|
nsim |
The number of simulations to form the reference distribution. |
seed |
Seed for the random number generation. |
cl |
Used for controlling parallel computations. See sections 'details' and 'examples' below. |
details |
The amount of output produced. Mainly relevant for debugging purposes. |
The model object must be fitted with maximum likelihood
(i.e. with REML=FALSE). If the object is fitted with restricted
maximum likelihood (i.e. with REML=TRUE) then the model is
refitted with REML=FALSE before the p-values are calculated. Put
differently, the user needs not worry about this issue.
The argument 'cl' (originally short for 'cluster') is used for controlling parallel computations. 'cl' can be NULL (default), positive integer or a list of clusters.
Special care must be taken on Windows platforms (described below) but the general picture is this:
The recommended way of controlling cl is to specify the
component \code{pbcl} in options() with
e.g. \code{options("pbcl"=4)}.
If cl is NULL, the function will look at if the pbcl has been set
in the options list with \code{getOption("pbcl")}
If cl=N then N cores will be used in the computations. If cl is
NULL then the function will look for
A numeric vector
Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
data(beets) head(beets) beet0 <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets, REML=FALSE) beet_no.harv <- update(beet0, . ~ . -harvest) rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=1) rd ## Not run: ## Note: Many more simulations must be made in practice. # Computations can be made in parallel using several processors: # 1: On OSs that fork processes (that is, not on windows): # -------------------------------------------------------- if (Sys.info()["sysname"] != "Windows"){ N <- 2 ## Or N <- parallel::detectCores() # N cores used in all calls to function in a session options("mc.cores"=N) rd <- PBrefdist(beet0, beet_no.harv, nsim=20) # N cores used just in one specific call (when cl is set, # options("mc.cores") is ignored): rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=N) } # In fact, on Windows, the approach above also work but only when setting the # number of cores to 1 (so there is to parallel computing) # In all calls: # options("mc.cores"=1) # rd <- PBrefdist(beet0, beet_no.harv, nsim=20) # Just once # rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=1) # 2. On all platforms (also on Windows) one can do # ------------------------------------------------ library(parallel) N <- 2 ## Or N <- detectCores() clus <- makeCluster(rep("localhost", N)) # In all calls in a session options("pb.cl"=clus) rd <- PBrefdist(beet0, beet_no.harv, nsim=20) # Just once: rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=clus) stopCluster(clus) ## End(Not run)data(beets) head(beets) beet0 <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest), data=beets, REML=FALSE) beet_no.harv <- update(beet0, . ~ . -harvest) rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=1) rd ## Not run: ## Note: Many more simulations must be made in practice. # Computations can be made in parallel using several processors: # 1: On OSs that fork processes (that is, not on windows): # -------------------------------------------------------- if (Sys.info()["sysname"] != "Windows"){ N <- 2 ## Or N <- parallel::detectCores() # N cores used in all calls to function in a session options("mc.cores"=N) rd <- PBrefdist(beet0, beet_no.harv, nsim=20) # N cores used just in one specific call (when cl is set, # options("mc.cores") is ignored): rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=N) } # In fact, on Windows, the approach above also work but only when setting the # number of cores to 1 (so there is to parallel computing) # In all calls: # options("mc.cores"=1) # rd <- PBrefdist(beet0, beet_no.harv, nsim=20) # Just once # rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=1) # 2. On all platforms (also on Windows) one can do # ------------------------------------------------ library(parallel) N <- 2 ## Or N <- detectCores() clus <- makeCluster(rep("localhost", N)) # In all calls in a session options("pb.cl"=clus) rd <- PBrefdist(beet0, beet_no.harv, nsim=20) # Just once: rd <- PBrefdist(beet0, beet_no.harv, nsim=20, cl=clus) stopCluster(clus) ## End(Not run)
Compare two nested models using parametric bootstrap simulation of the likelihood ratio statistic. Supports models fitted via lm, lme4 (lmer/glmer), nlme (lme/gls), etc.
pb2_modcomp( fit1, fit0, nsim = 1000, sequential = FALSE, h = 20, engine = "serial", nworkers = 2, verbose = FALSE )pb2_modcomp( fit1, fit0, nsim = 1000, sequential = FALSE, h = 20, engine = "serial", nworkers = 2, verbose = FALSE )
fit1 |
The larger (alternative) model. |
fit0 |
The smaller (null) model. |
nsim |
Number of simulations. In fixed bootstrap: total number of simulations. In sequential bootstrap: maximum number of simulations allowed. |
sequential |
Logical; if TRUE, use sequential bootstrap sampling to reach target number of extreme hits. |
h |
Number of extreme hits to target in sequential sampling. |
engine |
Parallelisation engine: "serial", "parallel", or "future". |
nworkers |
Number of workers for parallel/future engine. |
verbose |
Logical; if TRUE, print progress messages. |
The models should both be fitted by maximum likelihood (not REML). If REML was used, the function will automatically refit with REML = FALSE where possible.
An object of class PBmodcomp, with print(),
summary(), and plot() methods.
Best Practice: Always fit your models with the data=
argument. This ensures all covariates used in the model
formula are stored with the model object, enabling reliable
simulation and refitting for bootstrap analysis, including on
parallel workers. Without data=, refitting may fail in
parallel contexts and reproducibility is compromised.
Re-estimates a model using the same formula and design but with a new response vector.
The refit generic allows simulation-based workflows (e.g.,
parametric bootstrap) where new synthetic responses are drawn from
a fitted model and the model is re-fitted with the same design
structure.
## S3 method for class 'lm' refit(object, newresp = NULL, ...) ## S3 method for class 'lme' refit(object, newresp = NULL, ...) ## S3 method for class 'gls' refit(object, newresp = NULL, ...)## S3 method for class 'lm' refit(object, newresp = NULL, ...) ## S3 method for class 'lme' refit(object, newresp = NULL, ...) ## S3 method for class 'gls' refit(object, newresp = NULL, ...)
object |
A fitted model object. Supported classes include |
newresp |
A numeric vector of new response values of the same length as the original data. |
... |
Additional arguments (currently ignored). |
A new fitted model object of the same class as the original.
refit.lmRefits a linear model with the same formula but a new response vector.
refit.lmeRefits a linear mixed-effects model (from nlme) with new response data.
refit.glsRefits a generalized least squares model (from nlme) with new response data.
if (require(nlme) && require(lme4)) { data(Orthodont, package = "nlme") # Fit models fit_lm <- lm(distance ~ age, data = Orthodont) fit_gls <- gls(distance ~ age, data = Orthodont) fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont) # Simulate new response vectors set.seed(123) new_y <- rnorm(nrow(Orthodont), mean = mean(Orthodont$distance), sd = sd(Orthodont$distance)) # Refit models with new response refit(fit_lm, newresp = new_y) refit(fit_gls, newresp = new_y) refit(fit_lme, newresp = new_y) }if (require(nlme) && require(lme4)) { data(Orthodont, package = "nlme") # Fit models fit_lm <- lm(distance ~ age, data = Orthodont) fit_gls <- gls(distance ~ age, data = Orthodont) fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont) # Simulate new response vectors set.seed(123) new_y <- rnorm(nrow(Orthodont), mean = mean(Orthodont$distance), sd = sd(Orthodont$distance)) # Refit models with new response refit(fit_lm, newresp = new_y) refit(fit_gls, newresp = new_y) refit(fit_lme, newresp = new_y) }
Computes residuals of the form , i.e., the difference between the observed response
and the fixed-effects prediction. These are also called population-level or marginal residuals.
For models without random effects (e.g., gls), this is equivalent to the usual residuals.
residuals0(object, ...)residuals0(object, ...)
object |
A fitted model object. Supported classes are |
... |
Not used. |
A numeric vector of residuals.
if (require(nlme)) { fit_lme <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont) head(residuals0(fit_lme)) fit_gls <- gls(distance ~ age, data = Orthodont) head(residuals0(fit_gls)) } if (require(lme4)) { fit_lmer <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy) head(residuals0(fit_lmer)) }if (require(nlme)) { fit_lme <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont) head(residuals0(fit_lme)) fit_gls <- gls(distance ~ age, data = Orthodont) head(residuals0(fit_gls)) } if (require(lme4)) { fit_lmer <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy) head(residuals0(fit_lmer)) }
An approximate F-test based on the Satterthwaite approach.
SATmodcomp( largeModel, smallModel, betaH = 0, details = 0, eps = sqrt(.Machine$double.eps) ) ## S3 method for class 'lmerMod' SATmodcomp( largeModel, smallModel, betaH = 0, details = 0, eps = sqrt(.Machine$double.eps) )SATmodcomp( largeModel, smallModel, betaH = 0, details = 0, eps = sqrt(.Machine$double.eps) ) ## S3 method for class 'lmerMod' SATmodcomp( largeModel, smallModel, betaH = 0, details = 0, eps = sqrt(.Machine$double.eps) )
largeModel |
An |
smallModel |
An |
betaH |
A number or a vector of the beta of the hypothesis,
e.g. L beta=L betaH. If |
details |
If larger than 0 some timing details are printed. |
eps |
A small number. |
Notice: It cannot be guaranteed that the results agree with other implementations of the Satterthwaite approach!
Søren Højsgaard, [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
getKR, lmer, vcovAdj,
PBmodcomp, KRmodcomp
(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) ## Test for no effect of Days in fm1, i.e. test fm0 under fm1 SATmodcomp(fm1, "Days") SATmodcomp(fm1, ~.-Days) L1 <- cbind(0, 1) ## SATmodcomp(fm1, L1) ## FIXME SATmodcomp(fm1, fm0) anova(fm1, fm0) ## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2 SATmodcomp(fm2, "(Days+I(Days^2))") SATmodcomp(fm2, ~. - Days - I(Days^2)) L2 <- rbind(c(0, 1, 0), c(0, 0, 1)) ## SATmodcomp(fm2, L2) ## FIXME SATmodcomp(fm2, fm0) anova(fm2, fm0) ## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2 SATmodcomp(fm2, "I(Days^2)") SATmodcomp(fm2, ~. - I(Days^2)) L3 <- rbind(c(0, 0, 1)) ## SATmodcomp(fm2, L3) ## FIXME SATmodcomp(fm2, fm1) anova(fm2, fm1)(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)) ## Test for no effect of Days in fm1, i.e. test fm0 under fm1 SATmodcomp(fm1, "Days") SATmodcomp(fm1, ~.-Days) L1 <- cbind(0, 1) ## SATmodcomp(fm1, L1) ## FIXME SATmodcomp(fm1, fm0) anova(fm1, fm0) ## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2 SATmodcomp(fm2, "(Days+I(Days^2))") SATmodcomp(fm2, ~. - Days - I(Days^2)) L2 <- rbind(c(0, 1, 0), c(0, 0, 1)) ## SATmodcomp(fm2, L2) ## FIXME SATmodcomp(fm2, fm0) anova(fm2, fm0) ## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2 SATmodcomp(fm2, "I(Days^2)") SATmodcomp(fm2, ~. - I(Days^2)) L3 <- rbind(c(0, 0, 1)) ## SATmodcomp(fm2, L3) ## FIXME SATmodcomp(fm2, fm1) anova(fm2, fm1)
Simulates response vectors from the marginal distribution implied by a fitted model. For mixed models, this means integrating over random effects (i.e. predicting at level 0).
simulate0(object, nsim = 1, seed = NULL, ...)simulate0(object, nsim = 1, seed = NULL, ...)
object |
A fitted model object. Supports objects of class |
nsim |
Number of simulated response vectors to generate. Default is 1. |
seed |
Optional seed for reproducibility. |
... |
Currently ignored. |
For gls models, the mean is the fitted values.
For lme models, the mean is the level-0 prediction (no random effects).
For lmerMod models, the mean is the marginal mean with random effects set to zero (re.form=NA).
The covariance matrix is obtained via cov_matrix(), representing the implied marginal covariance structure.
A data frame with length(mu) rows and nsim columns. Each column is one simulated response vector.
if (require(nlme) && require(lme4) && require(MASS)) { data(Orthodont, package = "nlme") NSIM <- 6 # GLS model fit_gls <- gls(distance ~ age, data = Orthodont) sim_gls <- simulate0(fit_gls, nsim = NSIM, seed = 123) head(sim_gls) # LME model fit_lme <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont) sim_lme <- simulate0(fit_lme, nsim = NSIM, seed = 123) head(sim_lme) # lmerMod model fit_lmer <- lme4::lmer(distance ~ age + (1 | Subject), data = Orthodont) sim_lmer <- simulate0(fit_lmer, nsim = NSIM, seed = 123) head(sim_lmer) # lm fit_lm <- lm(distance ~ age, data = Orthodont) sim_lm <- simulate0(fit_lm, nsim = NSIM, seed = 123) head(sim_lm) }if (require(nlme) && require(lme4) && require(MASS)) { data(Orthodont, package = "nlme") NSIM <- 6 # GLS model fit_gls <- gls(distance ~ age, data = Orthodont) sim_gls <- simulate0(fit_gls, nsim = NSIM, seed = 123) head(sim_gls) # LME model fit_lme <- lme(distance ~ age, random = ~1 | Subject, data = Orthodont) sim_lme <- simulate0(fit_lme, nsim = NSIM, seed = 123) head(sim_lme) # lmerMod model fit_lmer <- lme4::lmer(distance ~ age + (1 | Subject), data = Orthodont) sim_lmer <- simulate0(fit_lmer, nsim = NSIM, seed = 123) head(sim_lmer) # lm fit_lm <- lm(distance ~ age, data = Orthodont) sim_lm <- simulate0(fit_lm, nsim = NSIM, seed = 123) head(sim_lm) }
Calculates asymptotic p-value, bootstrap p-value with CI, Bartlett corrections (mean and median-based) and F-approximations (mean and median-based).
summarize_pb(LRTstat, ref, conf.level = 0.95)summarize_pb(LRTstat, ref, conf.level = 0.95)
LRTstat |
Numeric vector: c(tobs, df) = observed test statistic and df. |
ref |
Numeric vector of simulated bootstrap test statistics. |
conf.level |
Confidence level for bootstrap p CI (default 0.95). |
List with results table, moments, samples info, CI, SE, etc.
Tidy method for nlme::gls objects
## S3 method for class 'gls' tidy(x, conf.int = FALSE, conf.level = 0.95, exponentiate = FALSE, ...)## S3 method for class 'gls' tidy(x, conf.int = FALSE, conf.level = 0.95, exponentiate = FALSE, ...)
x |
A fitted model of class "gls" (from nlme). |
conf.int |
Logical; include confidence intervals? |
conf.level |
Confidence level for intervals. |
exponentiate |
Logical; return exp(estimate) etc.? |
... |
Unused (for S3 compatibility). |
data.frame with columns term, estimate, std.error, statistic, p.value and optionally conf.low, conf.high.
Søren Højsgaard, [email protected]
Chisq test
X2modcomp(largeModel, smallModel, betaH = 0, details = 0, ...) ## Default S3 method: X2modcomp(largeModel, smallModel, betaH = 0, details = 0, ...)X2modcomp(largeModel, smallModel, betaH = 0, details = 0, ...) ## Default S3 method: X2modcomp(largeModel, smallModel, betaH = 0, details = 0, ...)
largeModel |
An |
smallModel |
An |
betaH |
A number or a vector of the beta of the hypothesis,
e.g. L beta=L betaH. If |
details |
If larger than 0 some timing details are printed. |
... |
Additional arguments, currently not used. |
Ulrich Halekoh [email protected], Søren Højsgaard [email protected]
(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy)) (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)) (fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy))
X2modcomp(fm1, "Days") X2modcomp(fm1, ~.-Days) X2modcomp(fm1, fm0) anova(fm1, fm0)
L1 <- cbind(0, 1) X2modcomp(fm1, L1) ## FIXME