
This package implements the synthetic difference-in-differences (SDID) estimator for the average treatment effect in panel data, as proposed in Arkhangelsky et al. (2019). We observe matrices of outcomes $Y$ and binary treatment indicators $W$ that satisfy $Y_{ij} = L_{ij} + \tau_{ij} W_{ij} + \varepsilon_{ij}$. Here $\tau_{ij}$ is the effect of treatment on unit $i$ at time $j$, and we estimate the average effect of treatment when and where it happened: the average of $\tau_{ij}$ over the observations with $W_{ij} = 1$. All treated units must begin treatment simultaneously, so $W$ is a block matrix: $W_{ij} = 1$ for $i > N_0$ and $j > T_0$ and zero otherwise, with $N_0$ denoting the number of control units and $T_0$ the number of observation times before onset of treatment. This applies, in particular, to the case of a single treated unit or treated period.
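To make the setup concrete, the following base-R sketch simulates a small panel that follows the model above and computes the target estimand directly from the simulated effects. The dimensions, the additive unit-plus-time structure chosen for $L$, and all variable names are illustrative assumptions and are not part of the package.

```r
# Illustrative simulation of Y = L + tau * W + eps with block treatment.
# All sizes and names here are assumptions made for this example.
set.seed(1)
N  <- 10; N0 <- 7    # 7 control units, 3 treated units
TT <- 8;  T0 <- 5    # 5 pre-treatment periods, 3 post-treatment periods

# Block treatment matrix: W[i, j] = 1 only for i > N0 and j > T0
W <- matrix(0, N, TT)
W[(N0 + 1):N, (T0 + 1):TT] <- 1

# Systematic component L: unit effects plus time effects
L <- outer(rnorm(N), rep(1, TT)) + outer(rep(1, N), seq_len(TT) / 2)

# Heterogeneous effects tau[i, j] and noise
tau <- matrix(rnorm(N * TT, mean = 2, sd = 0.5), N, TT)
eps <- matrix(rnorm(N * TT, sd = 0.1), N, TT)
Y   <- L + tau * W + eps

# Target estimand: the average of tau over the treated cells (W == 1)
att <- mean(tau[W == 1])
att
```

In real data only $Y$ and $W$ are observed, so this average has to be estimated; the simulation just shows which cells it averages over.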

What’s New

Version 2.0 introduces a modern formula-based interface similar to lm() and plm(), while maintaining 100% backward compatibility:

  • Formula interface: synthdid(outcome ~ treatment, data, index = c("unit", "time"))
  • Standard R methods: summary(), coef(), confint(), predict(), residuals(), fitted()
  • Easy method comparison: Switch between estimators with update(result, method = "did")
  • Enhanced performance: RcppArmadillo with AVX vectorization for 3-8x speedup

See the new vignette for details.

Installation

The current development version can be installed from source using devtools.

devtools::install_github("ZhenyaKosovan/synthdid")

Quick Start

library(synthdid)

# Estimate the effect of California Proposition 99 on cigarette consumption
data("california_prop99")

# Modern formula interface
set.seed(12345)
result <- synthdid(PacksPerCapita ~ treated,
                   data = california_prop99,
                   index = c("State", "Year"),
                   se = TRUE,
                   se_method = "placebo")

# Use standard R methods
print(result)
summary(result)
coef(result)
confint(result)
plot(result)

# Compare estimators easily
did_result <- update(result, method = "did")
sc_result <- update(result, method = "sc")

# Predictions and diagnostics
effect_curve <- predict(result, type = "effect")
counterfactual <- predict(result, type = "counterfactual")
residuals(result, type = "control")

Classic Interface (Still Supported)

The original matrix-based interface continues to work exactly as before:

library(synthdid)

# Estimate the effect of California Proposition 99 on cigarette consumption
data("california_prop99")
setup <- panel.matrices(california_prop99)
tau_hat <- synthdid_estimate(setup$Y, setup$N0, setup$T0)

# Note: standard-error estimation re-fits the model once per replication, which can be time-consuming.
se <- sqrt(vcov(tau_hat, method = "placebo"))
sprintf("point estimate: %1.2f", tau_hat)
sprintf("95%% CI (%1.2f, %1.2f)", tau_hat - 1.96 * se, tau_hat + 1.96 * se)
plot(tau_hat)
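The interval printed above is a normal-approximation confidence interval. As a small illustration, the helper below generalizes the hard-coded 1.96 to any confidence level using qnorm(). The function normal_ci is hypothetical (it is not part of synthdid), and the numbers passed to it are made up for the example.

```r
# Hypothetical helper: normal-approximation confidence interval.
normal_ci <- function(estimate, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # roughly 1.96 when level = 0.95
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# Made-up inputs, for illustration only
ci <- normal_ci(estimate = -10, se = 2)
sprintf("95%% CI (%1.2f, %1.2f)", ci["lower"], ci["upper"])
```

With real output, the same helper could be called as normal_ci(c(tau_hat), se), assuming se is the scalar computed above.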

Key Features

Multiple Estimators

Switch between different panel data estimators:

# Synthetic Difference-in-Differences (default)
synthdid_est <- synthdid(outcome ~ treatment, data, index, method = "synthdid")

# Pure Difference-in-Differences
did_est <- synthdid(outcome ~ treatment, data, index, method = "did")

# Synthetic Control
sc_est <- synthdid(outcome ~ treatment, data, index, method = "sc")
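One way to see how the three estimators relate, following the framing in Arkhangelsky et al., is as a weighted double difference of the outcome matrix. Difference-in-differences uses uniform weights on control units and pre-treatment periods, synthetic control fits unit weights and places no weight on pre-treatment periods, and SDID fits both sets of weights. The sketch below implements only the uniform-weight (DID) case in base R. The function name weighted_double_diff and the simulated panel are assumptions for illustration, and the package's own weight fitting is not reproduced here.

```r
# Weighted double difference: a treated-vs-control contrast over units
# combined with a post-vs-pre contrast over time. Rows 1..N0 are controls
# and columns 1..T0 are pre-treatment periods, matching the block layout of W.
weighted_double_diff <- function(Y, N0, T0,
                                 omega  = rep(1 / N0, N0),   # control-unit weights
                                 lambda = rep(1 / T0, T0)) { # pre-period weights
  N1 <- nrow(Y) - N0
  T1 <- ncol(Y) - T0
  unit_contrast <- c(-omega, rep(1 / N1, N1))
  time_contrast <- c(-lambda, rep(1 / T1, T1))
  drop(t(unit_contrast) %*% Y %*% time_contrast)
}

# Noise-free panel with additive unit and time effects and a constant effect of 3
N <- 6; N0 <- 4; TT <- 5; T0 <- 3
unit_fx <- c(1, 2, 3, 4, 10, 12)
time_fx <- c(0, 1, 2, 4, 7)
W <- matrix(0, N, TT)
W[(N0 + 1):N, (T0 + 1):TT] <- 1
Y <- outer(unit_fx, time_fx, "+") + 3 * W

# Uniform weights give the classical DID estimate, which recovers the effect here
did_hat <- weighted_double_diff(Y, N0, T0)
did_hat
```

Because both contrasts sum to zero, additive unit and time effects cancel, which is why the constant effect is recovered in this noise-free example. Passing non-uniform omega or lambda gives the general form, but choosing those weights is the part this sketch leaves out.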

Covariate Adjustment

Include time-varying covariates (if available in your dataset):

# Example syntax - requires covariates in your data
result <- synthdid(outcome ~ treatment | covariate1 + covariate2,
                   data = your_data,
                   index = c("unit", "time"))

Standard Error Methods

Choose from multiple SE estimation methods:

# Bootstrap (default, most reliable)
result <- synthdid(outcome ~ treatment, data, index,
                   se = TRUE, se_method = "bootstrap", se_replications = 200)

# Jackknife (faster)
result <- synthdid(outcome ~ treatment, data, index,
                   se = TRUE, se_method = "jackknife")

# Placebo (for single treated unit)
result <- synthdid(outcome ~ treatment, data, index,
                   se = TRUE, se_method = "placebo", se_replications = 100)
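To illustrate the idea behind the placebo method, the sketch below repeatedly relabels randomly chosen control units as treated, re-estimates the effect on the controls alone, and uses the spread of those placebo estimates as a standard error. It uses a plain difference-in-differences estimate as a stand-in estimator on simulated data. Both did_simple and placebo_se_sketch are hypothetical helpers written for this example, and the package's actual implementation may differ in its details.

```r
# Stand-in estimator: classical DID on a matrix whose first N0 rows are
# controls and whose first T0 columns are pre-treatment periods.
did_simple <- function(Y, N0, T0) {
  treated <- (N0 + 1):nrow(Y)
  control <- 1:N0
  pre     <- 1:T0
  post    <- (T0 + 1):ncol(Y)
  (mean(Y[treated, post]) - mean(Y[treated, pre])) -
    (mean(Y[control, post]) - mean(Y[control, pre]))
}

# Placebo standard error: only control rows are used, so no real treatment
# effect enters the placebo estimates.
placebo_se_sketch <- function(Y, N0, T0, replications = 100) {
  N1 <- nrow(Y) - N0
  stopifnot(N0 > N1)  # need more controls than treated units
  Y_control <- Y[1:N0, , drop = FALSE]
  estimates <- replicate(replications, {
    shuffled <- Y_control[sample(N0), , drop = FALSE]
    # The last N1 shuffled controls play the role of treated units
    did_simple(shuffled, N0 - N1, T0)
  })
  sd(estimates)
}

# Simulated panel: 29 controls, 1 treated unit, true effect of 2
set.seed(12345)
N <- 30; N0 <- 29; TT <- 12; T0 <- 8
Y <- outer(rnorm(N), rep(1, TT)) + outer(rep(1, N), rnorm(TT)) +
  matrix(rnorm(N * TT, sd = 0.5), N, TT)
Y[N, (T0 + 1):TT] <- Y[N, (T0 + 1):TT] + 2

tau_hat <- did_simple(Y, N0, T0)
se_hat  <- placebo_se_sketch(Y, N0, T0, replications = 200)
c(estimate = tau_hat, se = se_hat)
```

Since the placebo draws never touch the treated rows, this approach remains usable with a single treated unit, which is the case the comment above singles out.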

Speeding Up Standard Error Computation

Bootstrap and placebo standard errors use furrr under the hood. You can enable parallel execution by setting a future plan:

library(future)
library(synthdid)

data("california_prop99")

# Switch to 4 parallel worker processes. plan() returns the previous plan
# (`sequential` by default), which is saved so it can be restored afterwards.
old_plan <- future::plan(future::multisession, workers = 4)

# Estimate with parallel SE computation
result <- synthdid(PacksPerCapita ~ treated,
                   data = california_prop99,
                   index = c("State", "Year"),
                   se = TRUE,
                   se_method = "bootstrap",
                   se_replications = 200)

print(result)
summary(result)

# Restore the previous plan (on.exit() only applies inside a function,
# so the plan is reset explicitly in a top-level script)
future::plan(old_plan)

Note: on some platforms (e.g., CRAN macOS/Windows builders) multisession may be restricted; in that case future::plan() will fall back to sequential execution.

Performance Optimizations

This package uses RcppArmadillo with AVX vectorization for significant performance improvements:

  • 3-8x faster matrix-vector operations using optimized BLAS
  • 2-4x faster tensor operations for covariate adjustment
  • Automatic use of AVX128/256 SIMD instructions on modern CPUs
  • Zero overhead for formula interface

Package Interface Comparison

| Task | Old Interface | New Interface |
|---|---|---|
| Basic estimation | setup <- panel.matrices(data); synthdid_estimate(setup$Y, setup$N0, setup$T0) | synthdid(outcome ~ treatment, data, index) |
| Get coefficient | c(result) | coef(result) |
| Summary | Custom function | summary(result) |
| Confidence intervals | Manual calculation | confint(result) |
| Predictions | Custom calculation | predict(result, type = "counterfactual") |
| Method comparison | Re-run with different function | update(result, method = "did") |

Documentation

For detailed examples and use cases, see the package vignettes.

References

Dmitry Arkhangelsky, Susan Athey, David A. Hirshberg, Guido W. Imbens, and Stefan Wager. Synthetic Difference in Differences. arXiv, 2019.

Contributing

Contributions are welcome! Please see the GitHub repository for more information.