Title: | Solve Generalized Estimating Equations for Clustered Data |
---|---|
Description: | Estimation of generalized linear models with correlated/clustered observations by use of generalized estimating equations (GEE). See e.g. Halekoh and Højsgaard, (2005, <doi:10.18637/jss.v015.i02>), for details. Several types of clustering are supported, including exchangeable variance structures, AR1 structures, M-dependent, user-specified variance structures and more. The model fitting computations are performed using modified code from the 'geeM' package, while the interface and output objects have been written to resemble the 'geepack' package. The package also contains additional tools for working with and inspecting results from the 'geepack' package, e.g. a 'confint' method for 'geeglm' objects from 'geepack'. |
Authors: | Anne Helby Petersen [aut], Lee McDaniel [aut] (Author of geeM), Claus Ekstrøm [ctb] (Wrote code for drop1 methods), Søren Højsgaard [aut, cre] (Author of geepack) |
Maintainer: | Søren Højsgaard <[email protected]> |
License: | GPL-3 |
Version: | 0.1.2 |
Built: | 2025-01-25 04:04:25 UTC |
Source: | https://github.com/cran/geeasy |
Compute Wald confidence intervals for mean structure parameters of geelm object.
## S3 method for class 'geelm' confint(object, parm = NULL, level = 0.95, std.err = "san.se", ...) ## S3 method for class 'geeglm' confint(object, parm = NULL, level = 0.95, std.err = "san.se", ...)
## S3 method for class 'geelm' confint(object, parm = NULL, level = 0.95, std.err = "san.se", ...) ## S3 method for class 'geeglm' confint(object, parm = NULL, level = 0.95, std.err = "san.se", ...)
object |
a fitted model object. |
parm |
specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
the confidence level required. |
std.err |
Which standard error estimation method that should
be used for computing the confidence intervals. Only |
... |
additional argument(s) for methods. |
A matrix (or vector) with columns giving lower and upper confidence limits for each parameter.
confint(geeglm)
:
geeglm
Model Using Wald or Score TestCompute all the single terms in the scope argument that can dropped from the model, and compute a table of the corresponding Wald test statistics.
## S3 method for class 'geeglm' drop1( object, scope, test = c("Wald", "none", "score", "sasscore"), method = c("robust", "naive", "sandwich"), ... )
## S3 method for class 'geeglm' drop1( object, scope, test = c("Wald", "none", "score", "sasscore"), method = c("robust", "naive", "sandwich"), ... )
object |
a fitted object of class geese. |
scope |
a formula giving the terms to be considered for adding or dropping. |
test |
the type of test to include. |
method |
Indicates which method is used for computing the
standard error. |
... |
other arguments. Not currently used |
An object of class anova
summarizing the differences in
fit between the models.
Claus Ekstrom [email protected]
drop1
, geeglm
, geese
library(geepack) data(ohio) fit <- geeglm(resp ~ age + smoke + age:smoke, id=id, data=ohio, family=binomial, corstr="exch", scale.fix=TRUE) drop1(fit)
library(geepack) data(ohio) fit <- geeglm(resp ~ age + smoke + age:smoke, id=id, data=ohio, family=binomial, corstr="exch", scale.fix=TRUE) drop1(fit)
geem
Model Using Wald or Score TestCompute all the single terms in the scope argument that can dropped from the model, and compute a table of the corresponding Wald test statistics.
## S3 method for class 'geem' drop1( object, scope, test = c("Wald", "none", "score", "sasscore"), method = c("robust", "naive", "sandwich"), ... )
## S3 method for class 'geem' drop1( object, scope, test = c("Wald", "none", "score", "sasscore"), method = c("robust", "naive", "sandwich"), ... )
object |
a fitted object of class geese. |
scope |
a formula giving the terms to be considered for adding or dropping. |
test |
the type of test to include. |
method |
Indicates which method is used for computing the
standard error. |
... |
other arguments. Not currently used |
An object of class anova
summarizing the differences in
fit between the models.
Claus Ekstrom [email protected]
drop1
, geem
library(geeM) library(geepack) data(ohio) fit <- geem(resp ~ age + smoke + age:smoke, id=id, data=ohio, family="binomial", corstr="exch", scale.fix=TRUE) drop1(fit)
library(geeM) library(geepack) data(ohio) fit <- geem(resp ~ age + smoke + age:smoke, id=id, data=ohio, family="binomial", corstr="exch", scale.fix=TRUE) drop1(fit)
Settings for controlling technical details of GEE fitting via geelm.
geelm.control( init.beta = NULL, init.phi = 1, tol = 1e-05, maxit = 20, scale.fix = FALSE, useP = TRUE, std.err = "san.se" )
geelm.control( init.beta = NULL, init.phi = 1, tol = 1e-05, maxit = 20, scale.fix = FALSE, useP = TRUE, std.err = "san.se" )
init.beta |
an optional vector with the initial values of beta. If not specified, then the intercept will be set to |
init.phi |
an optional initial overdispersion parameter. If not supplied, initialized to 1. |
tol |
Tolerance for asserting convergence. |
maxit |
integer giving the maximal number of Fisher Scoring iteration. |
scale.fix |
logical indicating if the scale should be fixed. |
useP |
If set to |
std.err |
Character string specifying which standard error
estimation method should be used. Supported options are
|
A list of values used for controlling model fitting.
Estimate mean structure parameters and their corresponding standard errors for generalized linear models with clustered or correlated observations by use of generalized estimating equations.
geelm.fit(x, y, id, offset, family, weights, control, corstr, start = NULL) geelm( formula, id = NULL, waves = NULL, data = parent.frame(), family = gaussian, corstr = "independence", Mv = 1, weights = NULL, corr.mat = NULL, offset = NULL, engine = "geeasy", output = "geelm", control = geelm.control() )
geelm.fit(x, y, id, offset, family, weights, control, corstr, start = NULL) geelm( formula, id = NULL, waves = NULL, data = parent.frame(), family = gaussian, corstr = "independence", Mv = 1, weights = NULL, corr.mat = NULL, offset = NULL, engine = "geeasy", output = "geelm", control = geelm.control() )
x , y
|
For For |
id |
A vector identifying the clusters. If NULL, then each observation is assigned its own cluster. |
offset |
this can be used to specify an a priori known
component to be included in the linear predictor during fitting.
This should be |
family |
A description of the error distribution and link function to be used
in the model. The argument can be one of three options: a |
weights |
an optional vector of ‘prior weights’ to be used
in the fitting process. Should be |
control |
A list of parameters for controlling the fitting process. |
corstr |
A character string specifying the correlation structure.
The default is "independence". Allowed structures are: |
start |
starting values for the parameters in the linear predictor. |
formula |
A formula expression similar to that for |
waves |
An numeric vector identifying the time ordering within clusters
(i.e. levels of |
data |
An optional data frame containing the variables in the model. |
Mv |
For |
corr.mat |
The correlation matrix for |
engine |
Engine used to fit the model. The default, |
output |
Output object type. There are two options; 1) |
Users may specify functions for link and variance functions, but the functions must be vectorized functions.
Offsets can be specified in the model formula, as in glm()
or they may be
specified using the offset
argument. If offsets are specified in both ways,
their sum is used as an offset.
For the "userdefined"
correlation option, the function accepts a
matrix with consecutive integers. Each such integer represent a distinct
parameter that will be estimated. All entries given as 1 will be assumed
to be the same as each other and will be assumed to be possibly different
from entries with a 2, and so on.geelm
only looks at the upper
triangle of the matrix. Any entry given as 0 will be fixed at 0.
If observations are dropped because they have a weight of 0, then the
denominator for the moment estimates of the correlation matrices are
calculated using the number of non-zero Pearson residuals for the
correlation structures unstructured
, userdefined
and
m-dependent
with Mv>1
. Therefore, residuals numerically
equal to 0 may cause problems in the calculation of correlation parameters.
Concerning the family
argument: If the supplied argument is a character
string, then the string should correspond to one of the family objects.
In order to define a link function, a list must be created with the
components (LinkFun, VarFun, InvLink, InvLinkDeriv)
, all of which are
vectorized functions. If the components in the list are not named
as (LinkFun, VarFun, InvLink, InvLinkDeriv)
, then geelm
assumes that the functions are given in that order. LinkFun and VarFun
are the link and variance functions. InvLink and InvLinkDeriv are the inverse
of the link function and the derivative of the inverse of the link function
and so are decided by the choice of the link function.
An object of class geelm
(inherits from geeglm
) representing the fit.
It contains the following slots:
$coefficients
: Coefficients from the mean structure model (betas) on their
original scales
$residuals
: Pearson residuals, in the order of the inputted dataset (with NAs omitted).
$fitted.values
: Fitted values (response scale), in the order of the inputted dataset
(with NAs omitted).
$rank
: The rank of the model matrix, i.e. the number of estimated mean structure
coefficients.
$qr
: QR decomposition of the model matrix (NA omitted).
$family
: A family object specifying which exponential family was used for fitting
the mean structure model, see family
for more information.
$linear.predictors
: The linear predictor on the original scale.
$weights
: Weights used for computations, in the order of the inputted dataset
(NAs omitted).
$prior.weights
: The original weights used to produce this geeglm object (set
by user or defaulted to 1 for all observations).
$df.residuals
: Residual degrees of freedom.
$y
: Outcome variable, in the order of the inputted dataset (NAs omitted).
$model
: The model.frame, ordered as the original inputted data with NAs omitted.
$call
: The original function call that produced this geeglm object.
$formula
: The formula used in the original call.
$terms
: The terms of the formula used in the original call.
$data
: The original dataset that was used for producing this geeglm object.
$offset
: Offset used for fitting the model, ordered as the original inputted data
with NAs omitted.
$control
: Value of control parameters used for fitting the model.
$method
: Internal function used for fitting the model.
$contrasts
: Contrasts used in the model matrix.
$xlevels
: Levels of factor variables used in the model formula (if any).
$geese
: An object containing further information about the variance estimation,
including a variance matrix for the beta-coefficients ($vbeta
), the estimated
coefficients for the working correlation matrix ($alpha
), the estimated dispersion
parameter ($gamma
), and the individual cluster sizes ($clusz
). See
geese
for more information.
$modelInfo
: Information about the link functions used for fitting the mean, variance
and scale structures of the model.
$id
: IDs used for identifying the clusters, ordered as the original inputted data
with NAs omitted.
$corstr
: Name of the correlation structured imposed on the model. If the
correlation structure requires further information, it is stored in a suitably named
attribute. For example, for m-dependent correlation structures, the m scalar is available
in an attribute named Mv
.
$cor.link
: Link function used for the correlation structure.
$std.err
: Method used to estimate the standard error of the mean structure
coefficients (betas).
geelm.fit()
:
Anne Helby Petersen, Lee McDaniel & Nick Henderson
# load data data("respiratory") respiratory$useid <- interaction(respiratory$center, respiratory$id) # fit model m <- geelm(outcome ~ treat + sex + age + baseline, data = respiratory, id = useid, family = "binomial", corstr = "exchangeable") ## Not run: get_jack_se <- function(object, dat){ parm <- sapply(1:nrow(dat), function(i){ dat.i <- dat[-i,] coef(update(object, data=dat.i)) }) parm <- t(parm) parm.mean <- apply(parm, 2, mean) parm.cent <- sapply(1:nrow(parm), function(i){ parm[i, ] - parm.mean }) parm.cent <- t(parm.cent) jack.var <- ((nrow(dat)-1) / nrow(dat)) * t(parm.cent) %*% parm.cent jack.se <- sqrt(diag(jack.var)) jack.se } # load data data("respiratory") respiratory$useid <- interaction(respiratory$center, respiratory$id) # fit model obj <- geelm(outcome ~ treat + sex + age + baseline, data = respiratory, id = useid, family = "binomial", corstr = "exchangeable") dat <- respiratory get_jack_se(obj, dat) summary(obj) |> coef() ## End(Not run)
# load data data("respiratory") respiratory$useid <- interaction(respiratory$center, respiratory$id) # fit model m <- geelm(outcome ~ treat + sex + age + baseline, data = respiratory, id = useid, family = "binomial", corstr = "exchangeable") ## Not run: get_jack_se <- function(object, dat){ parm <- sapply(1:nrow(dat), function(i){ dat.i <- dat[-i,] coef(update(object, data=dat.i)) }) parm <- t(parm) parm.mean <- apply(parm, 2, mean) parm.cent <- sapply(1:nrow(parm), function(i){ parm[i, ] - parm.mean }) parm.cent <- t(parm.cent) jack.var <- ((nrow(dat)-1) / nrow(dat)) * t(parm.cent) %*% parm.cent jack.se <- sqrt(diag(jack.var)) jack.se } # load data data("respiratory") respiratory$useid <- interaction(respiratory$center, respiratory$id) # fit model obj <- geelm(outcome ~ treat + sex + age + baseline, data = respiratory, id = useid, family = "binomial", corstr = "exchangeable") dat <- respiratory get_jack_se(obj, dat) summary(obj) |> coef() ## End(Not run)
Get information for a geelm/geeglm object
getGEE(object, name, ...) getME(object, name, ...)
getGEE(object, name, ...) getME(object, name, ...)
object |
A geeglm model object as obtained from |
name |
Name of the slot/component of the geelm/geeglm object that should be returned. See list of possible names in details below. |
... |
Any additional arguments passed on to other functions. |
The allowed names are:
coefficients
or beta
: Coefficients from the mean
structure model (betas) on their original scales
residuals
: Pearson residuals, in the order of the inputted
dataset (with NAs omitted).
fitted.values
: Fitted values (response scale), in the order
of the inputted dataset (with NAs omitted).
rank
: The rank of the model matrix, i.e. the number of
estimated mean structure coefficients.
family
: A family object specifying which exponential family
was used for fitting the mean structure model, see
family
for more information.
linear.predictors
: The linear predictor on the original
scale.
df.residuals
: Residual degrees of freedom.
corstr
: Name of the correlation structured imposed on the
model. If the correlation structure requires further information,
it is stored in a suitably named attribute. For example, for
m-dependent correlation structures, the m scalar is available in an
attribute named Mv
.
std.err
: Method used to estimate the standard error of the
mean structure coefficients (betas).
alpha
: The estimated parameter(s) for the variance
structure.
gamma
or dispersion
: The estimated dispersion
parameter.
vbeta
: The estimated variance matrix of the mean structure
(beta) coefficients.
beta.se
: The standard errors of the mean structure (beta)
coefficients.
clusz
or clustersizes
: Sizes of each of the clusters
in the data.
nclusters
: The total number of clusters.
The requested slot of the object.
getME()
:
Parameter estimates are plotted along with error bars indicating 95 pct confidence intervals.
plotEst(..., intercept = TRUE, par = NULL, colors = NULL)
plotEst(..., intercept = TRUE, par = NULL, colors = NULL)
... |
One or more models. Currently, |
intercept |
Logical indicating whether the intercept should be plotted (defaults
to |
par |
Which model parameters to plot estimates for, given as character strings.
Default is |
colors |
Color scale to use if several models are plotted. Defaults to a color blind friendly scale. If there are more models than there are colors, the color values are repeated. |
One or more models can be supplied, and if the parameters have the same names across models, they will be grouped together allowing for easy comparison.
Note that models can be given with or without names. If names are supplied (see example below), these are printed in the plot legend. Otherwise, the name of the model object is printed there instead.
Implementation details: As a default, the estimates are extracted
from the $coefficients
slot from the model object and
confidence intervals are computed by calling confint()
. This
means that plotEst
supports all models that have a
coefficents
slot and a confint
method.
No return values; called for side effects.
Anne Helby Petersen
# Fit example models data(iris) m1 <- lm(Sepal.Length ~ Petal.Length, iris) m2 <- lm(Sepal.Length ~ Petal.Length + Petal.Width, iris) # Plot one model plotEst(m2) # Plot two models with default model labels plotEst(m1, m2) # Plot two models with custom model labels (simple) plotEst(model1 = m1, model2 = m2) # Plot two models with custom model labels (with spacing) plotEst(`Simple model` = m1, `Full petal model` = m2) # Plot two models without intercept plotEst(m1, m2, intercept = FALSE) # Plot two models with custom parameter subset plotEst(m1, m2, par = c("Petal.Length")) # Plot two models with custom color scale given by color names plotEst(m1, m2, colors = c("red", "blue")) # Plot two models with custom color scale given by color hex codes # note: only the first colors are used as there are more # colors than models plotEst(m1, m2, colors = c("#CC6666", "#9999CC", "#66CC99"))
# Fit example models data(iris) m1 <- lm(Sepal.Length ~ Petal.Length, iris) m2 <- lm(Sepal.Length ~ Petal.Length + Petal.Width, iris) # Plot one model plotEst(m2) # Plot two models with default model labels plotEst(m1, m2) # Plot two models with custom model labels (simple) plotEst(model1 = m1, model2 = m2) # Plot two models with custom model labels (with spacing) plotEst(`Simple model` = m1, `Full petal model` = m2) # Plot two models without intercept plotEst(m1, m2, intercept = FALSE) # Plot two models with custom parameter subset plotEst(m1, m2, par = c("Petal.Length")) # Plot two models with custom color scale given by color names plotEst(m1, m2, colors = c("red", "blue")) # Plot two models with custom color scale given by color hex codes # note: only the first colors are used as there are more # colors than models plotEst(m1, m2, colors = c("#CC6666", "#9999CC", "#66CC99"))
Internal functions for the MESS package
scorefct(object, beta = NULL, testidx = NULL, sas = FALSE)
scorefct(object, beta = NULL, testidx = NULL, sas = FALSE)
object |
input geepack object from a geeglm fit. |
beta |
The estimated parameters. If set to |
testidx |
Indices of the beta parameters that should be tested equal to zero |
sas |
Logical. Should the SAS version of the score test be
computed. Defaults to |
Claus Ekstrom [email protected]