When there is indication of covariate imbalance, we may wish to construct a sample where the treatment and control groups are more similar than the original full sample. One way of doing so is by dropping units with extreme values of propensity scores. For these subjects, their covariate values are such that the probability of being in the treatment (or control) group is so overwhelmingly high that we cannot reliably find comparable units in the opposite group. We may therefore wish to forego estimating treatment effects for such units since nothing much can be credibly said about them.

A good rule of thumb is to drop units whose estimated propensity score is less than 0.1 or greater than 0.9. By default, once the propensity score has been estimated by running either `est_propensity` or `est_propensity_s`, the attribute `cutoff` will be set to 0.1:

```
>>> causal.cutoff
0.1
```

Calling the method `trim` at this point will drop subjects according to this rule of thumb. More generally, we can consider dropping units whose estimated propensity score lies outside of the \([\alpha, 1-\alpha]\) interval. To trim the sample at \(\alpha = 0.12\), for example, we set `cutoff` to 0.12 and call the method `trim`:

```
>>> causal.cutoff = 0.12
>>> causal.trim()
```

Once `trim` is called, the `causal` instance will mutate and behave as if the subjects outside of the \([\alpha, 1-\alpha]\) propensity range no longer exist. The usual object attributes and methods will still work as expected after trimming. If we inspect `summary_stats` at this point, for instance, we will find

```
>>> print(causal.summary_stats)

Summary Statistics

                       Controls (N_c=355)         Treated (N_t=324)             
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       41.568       29.026       63.069       27.225       21.501

                       Controls (N_c=355)         Treated (N_t=324)             
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0        3.701        2.825        4.546        2.563        0.313
             X1        3.489        2.784        4.453        2.461        0.367
```

Note that the output statistics are now different from what we saw earlier with the full sample. In particular, the sample sizes are now 355 and 324 instead of 392 and 608, and the normalized differences in covariate means are 0.313 and 0.367 instead of 0.706 and 0.880, showing marked improvement in covariate balance.
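Conceptually, trimming is just a masking operation on the underlying data arrays. The sketch below illustrates this with hypothetical data (the variable names `pscore`, `Y`, and `D` are stand-ins, not the library's internals; `causal` handles all of this for you when `trim` is called):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data standing in for the arrays held by a CausalModel
pscore = rng.uniform(0.01, 0.99, size=1000)   # estimated propensity scores
Y = rng.normal(size=1000)                     # observed outcomes
D = (rng.uniform(size=1000) < pscore) * 1     # treatment indicators

alpha = 0.12
# Keep only units whose propensity score lies inside [alpha, 1-alpha]
keep = (pscore >= alpha) & (pscore <= 1 - alpha)

Y_trim, D_trim, pscore_trim = Y[keep], D[keep], pscore[keep]
print(len(Y_trim), "of", len(Y), "units retained")
```

All subsequent summary statistics and estimators are then computed on the retained subsample only.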

If the trimming was not satisfactory, we can reset `causal` to its initial pristine state by

```
>>> causal.reset()
```

Note that doing so resets everything, so even the propensity scores will have to be re-estimated.

Instead of setting an arbitrary value for \(\alpha\), a procedure exists that will estimate the optimal cutoff. Crump, Hotz, Imbens, and Mitnik (2009) show that the asymptotic variance of the efficient estimator for the average treatment effect given that the covariate value \(X\) is in some subset \(\mathbb{S}\) of the covariate space is given by $$\frac{1}{\mathrm{P}(X \in \mathbb{S})} \mathrm{E}\left[ \frac{\sigma_t^2(X)}{p(X)} + \frac{\sigma_c^2(X)}{1-p(X)} \Big| X \in \mathbb{S} \right].$$

Here \(\sigma_t^2\) is the conditional variance function for treated units, and \(\sigma_c^2\) is the conditional variance function for control units.

Letting \(\mathbb{S}\) take the form of \([\alpha, 1-\alpha]\) and choosing \(\alpha\) to minimize the above asymptotic variance results in the optimal cutoff threshold. This optimal cutoff does not have a closed form solution, but can nonetheless be calculated in \(O(N \log{N})\) time.
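Concretely, Crump et al. (2009) show that the optimal \(\alpha\) solves a fixed-point condition: writing \(\lambda = 1/(\alpha(1-\alpha))\) and \(g_i = 1/(\hat{p}_i(1-\hat{p}_i))\), the optimal \(\lambda\) satisfies \(\lambda = 2\,\mathrm{E}[g \mid g \le \lambda]\). The following is a minimal sketch of this computation (an illustration, not *Causalinference*'s own implementation); the one-time sort is what gives the \(O(N \log N)\) running time:

```python
import numpy as np

def optimal_cutoff(pscores):
    """Sketch of the Crump et al. (2009) optimal trimming cutoff.

    Solves lam = 2 * E[g | g <= lam] for g_i = 1/(p_i*(1-p_i)),
    where lam = 1/(alpha*(1-alpha)), then inverts to recover alpha.
    """
    g = np.sort(1.0 / (pscores * (1.0 - pscores)))  # ascending; O(N log N)
    csum = np.cumsum(g)
    n = len(g)
    for k in range(1, n + 1):
        lam = 2.0 * csum[k - 1] / k        # 2 * mean of the k smallest g's
        upper = g[k] if k < n else np.inf
        if g[k - 1] <= lam < upper:        # fixed point lies in this bracket
            return 0.5 - np.sqrt(0.25 - 1.0 / lam)
    return 0.0                             # no interior solution: keep everyone
```

As a sanity check, if every unit has propensity score 0.5, then every \(g_i = 4\), the fixed point is \(\lambda = 8\), and the cutoff works out to \(0.5 - \sqrt{0.125} \approx 0.146\), so no units are actually dropped.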

To compute the optimal \(\alpha\) and trim the sample based on it in *Causalinference*, we simply call `trim_s`:

```
>>> causal.trim_s()
>>> causal.cutoff
0.0954928016329
>>> print(causal.summary_stats)

Summary Statistics

                       Controls (N_c=371)         Treated (N_t=363)             
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       41.331       29.608       66.067       28.108       24.736

                       Controls (N_c=371)         Treated (N_t=363)             
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0        3.709        2.872        4.658        2.522        0.351
             X1        3.407        2.784        4.661        2.517        0.472
```

As the above shows, the optimal cutoff was determined to be around 0.0955. Note that it is not necessary to call `trim` after `trim_s`, as the trimming occurs automatically as part of `trim_s`.

Since `causal` still behaves like a regular `CausalModel` instance after trimming, it is possible at this point to run any of the available treatment effect estimation procedures, including `est_via_ols` and `est_via_matching`. It is important to note, however, that the estimand is now the average treatment effect restricted to a subpopulation: $$\mathrm{E}[Y(1)-Y(0) \,|\, \alpha < p(X) < 1-\alpha].$$

Again, this is because when covariate imbalance is an issue, estimates of the unconditional average treatment effect \(\mathrm{E}[Y(1)-Y(0)]\) are inherently much less credible.

### References

Crump, R., Hotz, V. J., Imbens, G., & Mitnik, O. (2009). Dealing with limited overlap in estimation of average treatment effects. *Biometrika, 96*, 187-199.