library(fixest)
library(ggplot2)
library(cowplot)
<- function(object, newdata, se.fit = FALSE,
predict_partial interval = "none",
level = 0.95){
if(missing(newdata)) {
stop("predict_partial requires newdata and predicts for all group effects = 0.")
}
<- fit_feols; newdata <- newdata; se.fit = T; interval = "confidence"; level = 0.95
object
# Extract terms object, removing response variable
<- terms(object)
tt <- delete.response(tt)
Terms
# Remove intercept
attr(Terms, "intercept") <- 0
<- model.matrix(Terms, data = newdata)
X
if (class(object) == "fixest") {
<- as.numeric(coef(object))
B <- attributes(vcov(fit_feols, attr = T))$dof.K
df else if (class(object) %in% c("lm", "felm")) {
} <- as.numeric(object$coef)
B <- object$df.residual
df else {
} stop("class(object) should be lm, fixest, or felm.")
}
<- data.frame(fit = as.vector(X %*% B))
fit
if(se.fit | interval != "none") {
<- vcov(object)
sig <- apply(X, MARGIN = 1, FUN = get_se, sig = sig)
se
}
if(interval == "confidence"){
<- qt((1 - level) / 2 + level, df = df)
t_val $lwr <- fit$fit - t_val * se
fit$upr <- fit$fit + t_val * se
fitelse if (interval == "prediction"){
} stop("interval = \"prediction\" not yet implemented")
}if(se.fit){
return(list(fit=fit, se.fit = se))
else {
} return(fit)
}
}
<- function(r, sig) {
get_se # Compute linear combination, helper function for predict_partial
# Given numeric vector r (the constants) and vcov sig (the ), compute SE
<- matrix(r, nrow = 1)
r sqrt(r %*% sig %*% t(r))
}
<- 100
N <- data.frame(x = rnorm(N), group = rep(c("A", "B"), times = N/2))
data $y <- 1 + 3 * data$x - 2 * data$x^2 + rnorm(N, 0, 0.5) + as.numeric(data$group == "A") * 3
data
<- feols(y ~ x + I(x^2) | group, data = data)
fit_feols
<- data.frame(x = seq(-3, 3, 0.1))
newdata <- predict_partial(fit_feols, newdata, se.fit = T, interval = "confidence")
preds
<- cbind(newdata, preds$fit)
plot_df
ggplot(plot_df, aes(x = x, y = fit)) +
geom_line() +
geom_line(aes(y = lwr), linetype = "dashed") +
geom_line(aes(y = upr), linetype = "dashed") +
theme_cowplot()
In climate economics and in other settings, we often would like to estimate a response function, or the outcome as a function of some covariate, i.e.,
These models, however, usually include lots of controls, including fixed effects. Consider the following (fairly generic) estimating equation:
Computationally, this could mean including a large set of fixed effects, so estimating these models via ordinary least-squares (OLS) using dummy variables is a nonstarter for settings with thousands of fixed effects, computationally speaking. Instead, we use programs such as fixest
(there are many others, but fixest
is the best one I know of, right now), which use fancy mathematical tricks to alleviate the computational burden.
One of the most intuitive ways to consider the relationships we estimate, particularly when lm
, fixest
does not have a native prediction function that can supply standard errors. This is for a good reason: strictly speaking, predicting the outcome
So, we’re basically sunk if we want to plot
To see how this works, consider that a response function is really just the expected value of the outcome across a range of
As you can see, the fixed effects drop out, so for this linear-in-parameters model at least we don’t need the entire vcov to generate standard errors for each point on the response curve. So far as I know, though, there’s no easy way to run this kind of prediction without a bit of wrangling. Typically, I’ve used survey::svycontrast
to do this sort of thing (see here), but it always felt a bit fiddly. So the following code lets us do this kind of partial prediction easily, by just passing the variables we want to predict over.2
Note that this function also works with lm
and felm
, though there’s little reason to use it for lm
given that predict.lm
works (and provides more functionality).
<- lm(y ~ x + I(x^2), data = data)
fit_lm <- predict_partial(fit_lm, newdata, se.fit = T, interval = "confidence")
preds_lm library(lfe)
Loading required package: Matrix
<- felm(y ~ x + I(x^2), data = data)
fit_felm <- predict_partial(fit_felm, newdata, se.fit = T, interval = "confidence") preds_felm
In any case, I’m happy to finally close this issue, given that it’s been on my mind for almost six years.