# How to plot a specification curve

programming
Author

Patrick Baylis

Published

February 28, 2020

Like many researchers, I often want to plot a range of coefficient estimates to figure out if the results I’m finding are robust to other sensible specification and functional form choices. This kind of estimate is called a sensitivity curve (Simonsohn, Simmons, and Nelson 2015), and I am far from the first to do it. In fact, there are even a couple packages available: Joachim Gassen’s `rdfanalysis` and Philipp Masur’s `specr` (I haven’t used either, yet).

I wanted to roll my own, though. Since my code is fairly simple (less than 60 lines including comments and generous spacing) and uses the `tidyverse`, it may be helpful to other people too. First, here’s the output. It’s certainly not the prettiest version of a specification curve, but it gets the job done. This is my specification curve. There are many like it, but this one is mine.

Second, here’s the code (also available as a gist) that creates it.

``````# Create sensitivity curve of coefficient estimates
library(tidyverse)
library(cowplot)
library(fastDummies)

# Setup ----
rm(list = ls())
theme_set(theme_cowplot())
set.seed(42)

# Create some fake estimates to plot ----
# Required components: est, se, and any variables that describe specifications

# These are the possible specifications we "estimated"
fe <- c("Unit", "Time", "Unit + Time")
controls <- c("Basic", "Some", "Full")

estimates <- as_tibble(expand.grid(`Fixed Effects` = fe, `Controls` = controls))
estimates <- estimates %>% mutate(est = rnorm(n()), se = runif(n(), 0, 0.1))

# Create a plot of the estimates ----
spec_cols <- c("Fixed Effects", "Controls") # Could be set by user or figured out
# Note: This assumes the preferred ordering of the specification categories is the order in which they are given

estimates <- estimates %>% mutate(ci_l = est - 1.96 * se, ci_h = est + 1.96 * se)

estimates <- estimates %>%
arrange(est) %>% mutate(h_order = 1:n()) # Sort on point estimates for horizontal ordering

coef_plot <- ggplot(estimates, aes(x = h_order, y = est)) +
geom_linerange(aes(ymin = ci_l, ymax = ci_h), size = 1, alpha = 0.5) +
geom_point(fill = "white", shape = 21) +
labs(y = "Coefficient") +
theme(axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.line.x = element_blank(), axis.text.x = element_blank())
coef_plot

# Create a plot of the specifications ----

# Function to create a specification plot for a single category.
make_spec_plot <- function(category) {
# category = spec_cols # DEBUG
specs <- dummy_cols(estimates, select_columns = category, remove_selected_columns = T) %>%
select(h_order, starts_with(category)) %>%
pivot_longer(starts_with(category), names_prefix = paste0(category, "_")) %>%
mutate(name = factor(name, levels = rev(unique(name))))

spec_plot <- ggplot(specs, aes(x = h_order, y = name, alpha = value)) +
geom_point() +
scale_alpha_continuous(guide = FALSE) +
theme(axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.line.x = element_blank(), axis.text.x = element_blank()) +
theme(axis.text.y = element_text(size = 6), axis.title.y = element_blank(), axis.ticks.y = element_blank(), axis.line.y = element_blank())
spec_plot
}
spec_plots <- lapply(spec_cols, make_spec_plot)

combined_plot <- plot_grid(plotlist = c(list(coef_plot), spec_plots),
labels = c("", spec_cols), label_size = 8, label_fontface = "italic", vjust = 0.5, hjust = -0.1,
rel_heights = c(1, 0.2, 0.2), ncol = 1, align = "v")
save_plot("sensitivity-curve.png", combined_plot)``````