R/margins.R
, R/margins_betareg.R
, R/margins_clm.R
, and 11 more
margins.Rd
This package is an R port of Stata's margins command, implemented as an S3 generic margins()
for model objects, like those of class “lm” and “glm”. margins()
is an S3 generic function for building a “margins” object from a model object. Methods are currently implemented for several model classes (see Details, below).
The package also provides a low-level function, marginal_effects
, to estimate those quantities and return a data frame of unit-specific effects and another even lower-level function, dydx
, to provide variable-specific derivatives from models. Some of the underlying architecture for the package is provided by the low-level function prediction
, which provides a consistent data frame interface to predict
for a large number of model types. If a prediction
method exists for a model class, margin
should work for the model class but only those classes listed here have been tested and specifically supported.
margins(model, ...) # S3 method for betareg margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model, phi = FALSE), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for clm margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = c("response", "link"), vce = "none", eps = 1e-07, ...) # S3 method for default margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for glm margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for lm margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for loess margins(model, data, variables = NULL, at = NULL, vce = "none", eps = 1e-07, ...) # S3 method for merMod margins(model, data = find_data(model), variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for lmerMod margins(model, data = find_data(model), variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for multinom margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = NULL, vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) # S3 method for nnet margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, vce = "none", eps = 1e-07, ...) # S3 method for polr margins(model, data = find_data(model, parent.frame()), variables = NULL, at = NULL, type = NULL, vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...) margins_summary(model, ..., level = 0.95, by_factor = TRUE) # S3 method for svyglm margins(model, data = find_data(model, parent.frame()), design, variables = NULL, at = NULL, type = c("response", "link"), vcov = stats::vcov(model), vce = c("delta", "simulation", "bootstrap", "none"), iterations = 50L, unit_ses = FALSE, eps = 1e-07, ...)
model | A model object. See Details for supported model classes. |
---|---|
… | Arguments passed to methods, and onward to |
data | A data frame containing the data at which to evaluate the marginal effects, as in |
variables | A character vector with the names of variables for which to compute the marginal effects. The default ( |
at | A list of one or more named vectors, specifically values at which to calculate the marginal effects. This is an analogue of Stata's |
type | A character string indicating the type of marginal effects to estimate. Mostly relevant for non-linear models, where the reasonable options are “response” (the default) or “link” (i.e., on the scale of the linear predictor in a GLM). |
vcov | A matrix containing the variance-covariance matrix for estimated model coefficients, or a function to perform the estimation with |
vce | A character string indicating the type of estimation procedure to use for estimating variances. The default (“delta”) uses the delta method. Alternatives are “bootstrap”, which uses bootstrap estimation, or “simulation”, which averages across simulations drawn from the joint sampling distribution of model coefficients. The latter two are extremely time intensive. |
iterations | If |
unit_ses | If |
eps | A numeric value specifying the “step” to use when calculating numerical derivatives. |
level | A numeric value specifying the confidence level for calculating p-values and confidence intervals. |
by_factor | A logical specifying whether to order the output by factor (the default, |
design | Only for models estimated using |
A data frame of class “margins” containing the contents of data
, predicted values from model
for data
, the standard errors of the predictions, and any estimated marginal effects. If at = NULL
(the default), then the data frame will have a number of rows equal to nrow(data)
. Otherwise, the number of rows will be a multiple thereof based upon the number of combinations of values specified in at
. Columns containing marginal effects are distinguished by their name (prefixed by dydx_
). These columns can be extracted from a “margins” object using, for example, marginal_effects(margins(model))
. Columns prefixed by Var_
specify the variances of the average marginal effects, whereas (optional) columns prefixed by SE_
contain observation-specific standard errors. A special column, _at_number
, specifies which at
combination a given row corresponds to; the data frame carries an attribute “at” that specifies which combination of values this index represents. The summary.margins()
method provides for pretty printing of the results, particularly in cases where at
is specified. A variance-covariance matrix for the average marginal effects is returned as an attribute (though behavior when at
is non-NULL is unspecified).
Methods for this generic return a “margins” object, which is a data frame consisting of the original data, predicted values and standard errors thereof, estimated marginal effects from the model model
(for all variables used in the model, or the subset specified by variables
), along with attributes describing various features of the marginal effects estimates.
The default print method is concise; a more useful summary
method provides additional details.
margins_summary
is sugar that provides a more convenient way of obtaining the nested call: summary(margins(...))
.
Methods are currently implemented for the following object classes:
“betareg”, see betareg
“ivreg”, see ivreg
“lm”, see lm
“loess”, see loess
“nnet”, see nnet
“polr”, see polr
“svyglm”, see svyglm
The margins
methods simply construct a list of data frames based upon the values of at
(using build_datalist
), calculate marginal effects for each data frame (via marginal_effects
and, in turn, dydx
and prediction
), stacks the results together, and provides variance estimates. Alternatively, you can use marginal_effects
directly to only retrieve a data frame of marginal effects without constructing a “margins” object or variance estimates. That can be efficient for plotting, etc., given the time-consuming nature of variance estimation.
See dydx
for details on estimation of marginal effects.
The choice of vce
may be important. The default variance-covariance estimation procedure (vce = "delta"
) uses the delta method to estimate marginal effect variances. This is the fastest method. When vce = "simulation"
, coefficient estimates are repeatedly drawn from the asymptotic (multivariate normal) distribution of the model coefficients and each draw is used to estimate marginal effects, with the variance based upon the dispersion of those simulated effects. The number of iterations used is given by iterations
. For vce = "bootstrap"
, the bootstrap is used to repeatedly subsample data
and the variance of marginal effects is estimated from the variance of the bootstrap distribution. This method is markedly slower than the other two procedures. Again, iterations
regulates the number of bootstrap subsamples to draw. Some model classes (notably “loess”) fix vce ="none"
.
Greene, W.H. 2012. Econometric Analysis, 7th Ed. Boston: Pearson.
Stata manual: margins
. Retrieved 2014-12-15 from http://www.stata.com/manuals13/rmargins.pdf.
# basic example using linear model require("datasets") x <- lm(mpg ~ cyl * hp + wt, data = head(mtcars)) margins(x)#>#>#> cyl hp wt #> -11.81 0.6845 0.6995# obtain unit-specific standard errors# NOT RUN { margins(x, unit_ses = TRUE) # }# use of 'variables' argument to estimate only some MEs summary(margins(x, variables = "hp"))#> factor AME SE z p lower upper #> hp 0.6845 0.0489 13.9991 0.0000 0.5887 0.7804# use of 'at' argument ## modifying original data values margins(x, at = list(hp = 150))#>#>#> at(hp) cyl hp wt #> 150 -18.53 0.6845 0.6995## AMEs at various data values margins(x, at = list(hp = c(95, 150), cyl = c(4,6)))#>#>#> at(hp) at(cyl) cyl hp wt #> 95 4 -7.266 1.0942 0.6995 #> 150 4 -18.531 1.0942 0.6995 #> 95 6 -7.266 0.6845 0.6995 #> 150 6 -18.531 0.6845 0.6995# use of 'data' argument to obtain AMEs for a subset of data margins(x, data = mtcars[mtcars[["cyl"]] == 4,])#>#>#> cyl hp wt #> -4.733 1.094 0.6995margins(x, data = mtcars[mtcars[["cyl"]] == 6,])#>#>#> cyl hp wt #> -12.85 0.6845 0.6995# return discrete differences for continuous terms ## passes 'change' through '...' to dydx() margins(x, change = "sd")#>#>#> cyl hp wt #> -29.87 39.83 0.6481# summary() method summary(margins(x, at = list(hp = c(95, 150))))#> factor hp AME SE z p lower upper #> cyl 95.0000 -7.2657 0.5156 -14.0925 0.0000 -8.2762 -6.2552 #> cyl 150.0000 -18.5314 1.3065 -14.1836 0.0000 -21.0921 -15.9706 #> hp 95.0000 0.6845 0.0489 13.9989 0.0000 0.5887 0.7804 #> hp 150.0000 0.6845 0.0489 13.9986 0.0000 0.5887 0.7804 #> wt 95.0000 0.6995 0.3304 2.1169 0.0343 0.0519 1.3472 #> wt 150.0000 0.6995 0.3301 2.1193 0.0341 0.0526 1.3464margins_summary(x, at = list(hp = c(95, 150)))#> factor hp AME SE z p lower upper #> cyl 95.0000 -7.2657 0.5156 -14.0925 0.0000 -8.2762 -6.2552 #> cyl 150.0000 -18.5314 1.3065 -14.1836 0.0000 -21.0921 -15.9706 #> hp 95.0000 0.6845 0.0489 13.9989 0.0000 0.5887 0.7804 #> hp 150.0000 0.6845 0.0489 13.9986 0.0000 0.5887 0.7804 #> wt 95.0000 0.6995 0.3304 2.1169 0.0343 0.0519 1.3472 #> wt 150.0000 0.6995 0.3301 2.1193 0.0341 0.0526 1.3464## control row order of summary() output summary(margins(x, at = list(hp = c(95, 150))), by_factor = FALSE)#> factor hp AME SE z p lower upper #> cyl 95.0000 -7.2657 0.5156 -14.0925 0.0000 -8.2762 -6.2552 #> hp 95.0000 0.6845 0.0489 13.9989 0.0000 0.5887 0.7804 #> wt 95.0000 0.6995 0.3304 2.1169 0.0343 0.0519 1.3472 #> cyl 150.0000 -18.5314 1.3065 -14.1836 0.0000 -21.0921 -15.9706 #> hp 150.0000 0.6845 0.0489 13.9986 0.0000 0.5887 0.7804 #> wt 150.0000 0.6995 0.3301 2.1193 0.0341 0.0526 1.3464# alternative 'vce' estimation# NOT RUN { # bootstrap margins(x, vce = "bootstrap", iterations = 100L) # simulation (ala Clarify/Zelig) margins(x, vce = "simulation", iterations = 100L) # }# specifying a custom `vcov` argument if (require("sandwich")) { x2 <- lm(Sepal.Length ~ Sepal.Width, data = head(iris)) summary(margins(x2)) ## heteroskedasticity-consistent covariance matrix summary(margins(x2, vcov = vcovHC(x2))) }#>#> factor AME SE z p lower upper #> Sepal.Width 0.7224 0.3332 2.1681 0.0301 0.0694 1.3754# generalized linear model x <- glm(am ~ hp, data = head(mtcars), family = binomial) margins(x, type = "response")#>#>#> hp #> -0.0168margins(x, type = "link")#>#>#> hp #> -0.08928# multi-category outcome if (requireNamespace("nnet")) { data("iris3", package = "datasets") ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]), species = factor(c(rep("s",50), rep("c", 50), rep("v", 50)))) m <- nnet::nnet(species ~ ., data = ird, size = 2, rang = 0.1, decay = 5e-4, maxit = 200, trace = FALSE) margins(m) # default margins(m, category = "v") # explicit category }#>#>#> Sepal.L. Sepal.W. Petal.L. Petal.W. #> -0.02979 -0.08066 0.1131 0.2193# using margins_summary() for concise grouped operations list_data <- split(mtcars, mtcars$gear) list_mod <- lapply(list_data, function(x) lm(mpg ~ cyl + wt, data = x)) mapply(margins_summary, model = list_mod, data = list_data, SIMPLIFY = FALSE)#> $`3` #> factor AME SE z p lower upper #> cyl -0.8380 0.5773 -1.4515 0.1466 -1.9695 0.2936 #> wt -2.4727 0.8228 -3.0051 0.0027 -4.0855 -0.8600 #> #> $`4` #> factor AME SE z p lower upper #> cyl -1.6405 1.0927 -1.5013 0.1333 -3.7822 0.5012 #> wt -5.4414 1.7008 -3.1994 0.0014 -8.7749 -2.1080 #> #> $`5` #> factor AME SE z p lower upper #> cyl -0.8501 0.8180 -1.0392 0.2987 -2.4534 0.7532 #> wt -6.0898 1.9978 -3.0482 0.0023 -10.0055 -2.1741 #>