README

ddml is an implementation of double/debiased machine learning estimators as proposed by Chernozhukov et al. (2018). The key feature of ddml is the straightforward estimation of nuisance parameters using (short-)stacking (Wolpert, 1992), which allows for multiple machine learners to increase robustness to the underlying data generating process. See also Ahrens et al. (2024) for a detailed illustration of the practical benefits of combining DDML with (short-)stacking.

ddml is the sister R package to our Stata package, mirroring its key features while also leveraging R to simplify estimation with user-provided machine learners and/or sparse matrices. See also Ahrens et al. (2023) for additional discussion of the supported causal models and the benefits of (short-)stacking.

Installation

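# Install the latest development version from GitHub (requires devtools)
if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("thomaswiemann/ddml", dependencies = TRUE)

# Alternatively, install the latest public release from CRAN
install.packages("ddml")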

Example: LATE Estimation based on (Short-)Stacking

To illustrate ddml with a simple example, consider the included random subsample of 5,000 observations from the data of Angrist & Evans (1998). The data contain information on mothers' labor supply and their children, as well as demographic characteristics. See ?AE98 for details.

# Load ddml and set seed
library(ddml)
set.seed(75523)

# Construct variables from the included Angrist & Evans (1998) data
y <- AE98[, "worked"]
D <- AE98[, "morekids"]
Z <- AE98[, "samesex"]
X <- AE98[, c("age", "agefst", "black", "hisp", "othrace", "educ")]
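
For intuition about the parameter targeted in this example, the following minimal sketch computes a LATE in the simplest possible setting: a binary instrument, a binary treatment, and no covariates, where the LATE reduces to the Wald ratio of the reduced form to the first stage. The data are simulated and every name in the block (z, d, complier, late_wald) is hypothetical. It uses base R only and is not how ddml computes its estimates.

```r
# Simulated illustration of the Wald estimator for the LATE (no covariates).
# All variables are hypothetical and unrelated to the AE98 data.
set.seed(42)
n <- 10000
z <- rbinom(n, 1, 0.5)         # binary instrument
complier <- rbinom(n, 1, 0.4)  # 40% compliers, the rest are never-takers
d <- z * complier              # only compliers with z = 1 take the treatment
y <- 1 - 0.2 * d + rnorm(n)    # assumed treatment effect of -0.2

# Reduced form: effect of the instrument on the outcome
reduced_form <- mean(y[z == 1]) - mean(y[z == 0])
# First stage: effect of the instrument on treatment take-up
first_stage <- mean(d[z == 1]) - mean(d[z == 0])

# Wald ratio: estimate of the average effect among compliers
late_wald <- reduced_form / first_stage
print(c(reduced_form = reduced_form,
        first_stage = first_stage,
        late_wald = late_wald))
```

With covariates, the conditional means entering this ratio become nuisance functions that must be estimated, which is where machine learners enter.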

ddml_late estimates the local average treatment effect (LATE) using double/debiased machine learning (see ?ddml_late). Since the statistical properties of machine learners depend heavily on the underlying (unknown!) structure of the data, adaptively combining multiple machine learners can increase robustness. In the snippet below, ddml_late estimates the LATE with short-stacking based on three base learners:

# Estimate the local average treatment effect using short-stacking with base
#     learners ols, lasso (via glmnet), and xgboost.
late_fit_short <- ddml_late(y, D, Z, X,
                            learners = list(list(fun = ols),
                                            list(fun = mdl_glmnet),
                                            list(fun = mdl_xgboost,
                                                 args = list(nrounds = 100,
                                                             max_depth = 1))),
                            ensemble_type = 'nnls1',
                            shortstack = TRUE,
                            sample_folds = 10,
                            silent = TRUE)
summary(late_fit_short)
#> LATE estimation results: 
#>  
#>         Estimate Std. Error   t value  Pr(>|t|)
#> nnls1 -0.2105019   0.195529 -1.076576 0.2816698
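
To make the idea behind cross-fitting and short-stacking more concrete, here is a conceptual sketch in base R. It is not ddml's internal code: the data are simulated, the two base learners (a linear and a cubic regression) are stand-ins chosen for illustration, and all object names are hypothetical. Each learner is fit on the training folds and predicts the held-out fold; the combination weights are then chosen once on the full set of out-of-fold predictions. With two learners and weights constrained to be non-negative and to sum to one (the constraint that 'nnls1' refers to), the least-squares weight has a closed form that is clamped to [0, 1].

```r
# Conceptual sketch of cross-fitting plus short-stacking (not ddml internals).
set.seed(1)
n <- 500
x <- runif(n, -2, 2)
y <- sin(2 * x) + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x = x)

# Randomly assign each observation to one of K cross-fitting folds
K <- 5
fold <- sample(rep(seq_len(K), length.out = n))

# Out-of-fold predictions: column 1 = linear fit, column 2 = cubic fit
preds <- matrix(NA_real_, nrow = n, ncol = 2)
for (k in seq_len(K)) {
  train <- dat[fold != k, ]
  test <- dat[fold == k, ]
  fit_lin <- lm(y ~ x, data = train)
  fit_cub <- lm(y ~ poly(x, 3), data = train)
  preds[fold == k, 1] <- predict(fit_lin, newdata = test)
  preds[fold == k, 2] <- predict(fit_cub, newdata = test)
}

# Short-stacking step: minimize the squared error of
#   w1 * preds[, 1] + (1 - w1) * preds[, 2]  subject to 0 <= w1 <= 1
delta <- preds[, 1] - preds[, 2]
w1 <- sum((y - preds[, 2]) * delta) / sum(delta^2)
w1 <- min(max(w1, 0), 1)
weights <- c(linear = w1, cubic = 1 - w1)

# Compare out-of-fold mean squared errors of the learners and the ensemble
stacked <- drop(preds %*% weights)
mse <- c(linear = mean((y - preds[, 1])^2),
         cubic = mean((y - preds[, 2])^2),
         stacked = mean((y - stacked)^2))
print(weights)
print(mse)
```

Because the weights are computed a single time from predictions that are already cross-fitted, no additional layer of cross-validation within each fold is needed, which is the computational appeal of short-stacking relative to conventional stacking.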

Learn More about ddml

Other Double/Debiased Machine Learning Packages

ddml is built to easily (and quickly) estimate common causal parameters with multiple machine learners. With its support for short-stacking, sparse matrices, and easy-to-learn syntax, we hope ddml is a useful complement to DoubleML, the expansive R and Python package. DoubleML supports many advanced features such as multiway clustering and stacking.

References

Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2023). “ddml: Double/debiased machine learning in Stata.” https://arxiv.org/abs/2301.09397

Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2024). “Model averaging and double machine learning.” https://arxiv.org/abs/2401.01645

Angrist J, Evans W (1998). “Children and Their Parents’ Labor Supply: Evidence from Exogenous Variation in Family Size.” American Economic Review, 88(3), 450-477.

Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C B, Newey W, Robins J (2018). “Double/debiased machine learning for treatment and structural parameters.” The Econometrics Journal, 21(1), C1-C68.

Wolpert D H (1992). “Stacked generalization.” Neural Networks, 5(2), 241-259.
