Causal Inference in Python

Causal Inference in Python, or Causalinference in short, is a software package that implements various statistical and econometric methods used in the field variously known as Causal Inference, Program Evaluation, or Treatment Effect Analysis.

Through a series of blog posts on this page, I will illustrate the use of Causalinference, as well as provide high-level summaries of the underlying econometric theory with the non-specialist audience in mind. Source code for the package can be found at its GitHub page, and detailed documentation is available at causalinferenceinpython.org.

Trimming

When there is indication of covariate imbalance, we may wish to construct a sample where the treatment and control groups are more similar than in the original full sample. One way of doing so is by dropping units with extreme propensity scores. For these subjects, their covariate values make the probability of being in the treatment (or control) group so overwhelmingly high that we cannot reliably find comparable units in the opposite group. We may therefore wish to forgo estimating treatment effects for such units, since nothing much can be credibly said about them.
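The idea can be sketched in plain NumPy. Here `phat` is a hypothetical array of estimated propensity scores, not output from the package itself:

```python
import numpy as np

# Hypothetical estimated propensity scores for five units
phat = np.array([0.03, 0.25, 0.48, 0.71, 0.95])

alpha = 0.1
keep = (phat >= alpha) & (phat <= 1 - alpha)  # inside [alpha, 1-alpha]
print(np.nonzero(keep)[0])  # units 1, 2, 3 survive; 0 and 4 are trimmed
```

Units 0 and 4 have propensity scores below 0.1 and above 0.9, respectively, so under the rule-of-thumb discussed below they would be dropped.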

A good rule-of-thumb is to drop units whose estimated propensity score is less than 0.1 or greater than 0.9. Once the propensity score has been estimated by running either est_propensity or est_propensity_s, the attribute cutoff is set to 0.1 by default:

>>> causal.cutoff
0.1

Calling the method trim at this point will drop subjects according to this rule-of-thumb. In general, we can consider dropping units whose estimated propensity lies outside of the \([\alpha, 1-\alpha]\) interval. To trim the sample at \(\alpha = 0.12\), for example, we can set cutoff to 0.12 and call the method trim:

>>> causal.cutoff = 0.12
>>> causal.trim()

Once trim is called, the causal instance will mutate and behave as if the subjects outside of the \([\alpha, 1-\alpha]\) propensity range no longer exist. The usual object attributes and methods will still work as expected after trimming. If we inspect summary_stats at this point, for instance, we will find

>>> print(causal.summary_stats)

Summary Statistics

                       Controls (N_c=355)         Treated (N_t=324)             
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       41.568       29.026       63.069       27.225       21.501

                       Controls (N_c=355)         Treated (N_t=324)             
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0        3.701        2.825        4.546        2.563        0.313
             X1        3.489        2.784        4.453        2.461        0.367

Note that the output statistics are now different from what we saw earlier with the full sample. In particular, the sample sizes are now 355 and 324 instead of 392 and 608, and the normalized differences in covariate means are 0.313 and 0.367 instead of 0.706 and 0.880, showing marked improvement in covariate balance.
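The normalized difference in the Nor-diff column is, under the common pooled-variance definition (my assumption about what the package reports), the difference in covariate means scaled by the square root of the average of the two group variances:

```python
import numpy as np

def norm_diff(x_t, x_c):
    """Normalized difference in means between treated and control samples,
    assuming the standard definition:
    (mean_t - mean_c) / sqrt((s_t^2 + s_c^2) / 2)."""
    pooled_sd = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return (x_t.mean() - x_c.mean()) / pooled_sd
```

Unlike a t-statistic, this measure does not grow with the sample size, which makes it a better gauge of whether imbalance is a practical problem.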

If the trimming is not satisfactory, we can reset causal to its initial, pristine state by

>>> causal.reset()

Note that doing so resets everything, so even the propensity scores will have to be re-estimated.

Instead of setting an arbitrary value for \(\alpha\), a procedure exists that will estimate the optimal cutoff. Crump, Hotz, Imbens, and Mitnik (2009) show that the asymptotic variance of the efficient estimator for the average treatment effect given that the covariate value \(X\) is in some subset \(\mathbb{S}\) of the covariate space is given by $$\frac{1}{\mathrm{P}(X \in \mathbb{S})} \mathrm{E}\left[ \frac{\sigma_t^2(X)}{p(X)} + \frac{\sigma_c^2(X)}{1-p(X)} \Big| X \in \mathbb{S} \right].$$

Here \(\sigma_t^2\) is the conditional variance function for treated units, and \(\sigma_c^2\) is the conditional variance function for control units.

Letting \(\mathbb{S}\) take the form of \([\alpha, 1-\alpha]\) and choosing \(\alpha\) to minimize the above asymptotic variance results in the optimal cutoff threshold. This optimal cutoff does not have a closed form solution, but can nonetheless be calculated in \(O(N \log{N})\) time.
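The search can be sketched as follows. Assuming homoskedastic conditional variances, Crump et al. show that the optimal \(\gamma = 1/(\alpha(1-\alpha))\) solves the fixed-point condition \(\gamma = 2\,\mathrm{E}[g \mid g \le \gamma]\), where \(g = 1/(p(X)(1-p(X)))\). This is an illustrative standalone reimplementation, not the package's internal code:

```python
import numpy as np

def crump_cutoff(pscores):
    """Sketch of the Crump et al. (2009) optimal-cutoff search,
    assuming homoskedastic conditional variances."""
    g = np.sort(1.0 / (pscores * (1.0 - pscores)))      # ascending; sorting dominates: O(N log N)
    cum_mean = np.cumsum(g) / np.arange(1, g.size + 1)  # running means of the smallest k values
    k = np.nonzero(g <= 2.0 * cum_mean)[0].max()        # largest k satisfying the fixed-point condition
    gamma = 2.0 * cum_mean[k]
    return 0.5 - np.sqrt(0.25 - 1.0 / gamma)            # invert gamma = 1/(alpha(1-alpha))
```

For example, if every estimated propensity score is 0.5, the routine returns \(\alpha \approx 0.146\), which trims nothing since all scores lie well inside \([\alpha, 1-\alpha]\).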

To compute the optimal \(\alpha\) and trim the sample based on it in Causalinference, we simply call trim_s, as follows:

>>> causal.trim_s()
>>> causal.cutoff
0.0954928016329
>>> print(causal.summary_stats)

Summary Statistics

                       Controls (N_c=371)         Treated (N_t=363)             
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       41.331       29.608       66.067       28.108       24.736

                       Controls (N_c=371)         Treated (N_t=363)             
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0        3.709        2.872        4.658        2.522        0.351
             X1        3.407        2.784        4.661        2.517        0.472

As the output above shows, the optimal cutoff was determined to be around 0.0955. Note that it is not necessary to call trim after calling trim_s, as the trimming occurs automatically as part of trim_s.

Since causal still behaves like a regular CausalModel instance after trimming, it is possible at this point to run any treatment effect estimation procedures that are available, including est_via_ols and est_via_matching. It is important to note, however, that the estimand is now the average treatment effect restricted to a subpopulation: $$\mathrm{E}[Y(1)-Y(0) | \alpha < p(X) < 1-\alpha].$$

Again, this is because when covariate imbalance is an issue, estimates of the unconditional average treatment effect \(\mathrm{E}[Y(1)-Y(0)]\) are inherently much less credible.

References

Crump, R., Hotz, V. J., Imbens, G., & Mitnik, O. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96, 187-199.