Causal Inference in Python

Causal Inference in Python, or Causalinference for short, is a software package that implements various statistical and econometric methods used in the field variously known as Causal Inference, Program Evaluation, or Treatment Effect Analysis.

Through a series of blog posts on this page, I will illustrate the use of Causalinference, as well as provide high-level summaries of the underlying econometric theory with a non-specialist audience in mind. Source code for the package can be found at its GitHub page, and detailed documentation is available at


In this final post we outline the typical flow of a causal study in relation to the main tools provided by Causalinference. At a high level, assuming that unconfoundedness holds in the given problem, we can break down a typical study into two phases.

In the design phase, we inspect and manipulate the data set to ensure that the most credible analysis can be conducted on it. We achieve this by proceeding in the following steps:

  1. Assess covariate balance with summary_stats. If the normalized differences in covariate means suggest severe covariate imbalance, we can try to address it by using the propensity-score-based techniques below.
  2. Estimate the propensity score with est_propensity_s, so that the two propensity-based methods in the next two steps can be employed.
  3. Trim sample with trim_s to exclude subjects with extreme propensity scores. Since very little can be credibly said about such units, we should focus attention on the remaining units that exhibit a higher degree of covariate balance.
  4. Stratify sample with stratify_s to group similar subjects together and improve within-bin covariate balance.
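
To make the first and third steps concrete, here is a minimal sketch in plain Python of the two quantities involved: the normalized difference in covariate means, and trimming on the propensity score. It does not call Causalinference. The function names, the toy data, and the fixed cutoff of 0.1 are all assumptions made for illustration, and the cutoff is not necessarily the one trim_s would select.

```python
from math import sqrt
from statistics import mean, variance


def normalized_difference(treated, control):
    """Difference in group means, scaled by the root of the average group variance."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


def trim(units, cutoff=0.1):
    """Keep units whose propensity score lies in [cutoff, 1 - cutoff].

    The cutoff value is an illustrative assumption.
    """
    return [u for u in units if cutoff <= u["pscore"] <= 1 - cutoff]


# Hypothetical toy data: a single covariate (age) for each group.
age_treated = [34.0, 41.0, 38.0, 45.0]
age_control = [29.0, 31.0, 36.0, 28.0]
print(normalized_difference(age_treated, age_control))

# Hypothetical propensity scores; units 1 and 4 are extreme and get dropped.
units = [
    {"id": 1, "pscore": 0.03},
    {"id": 2, "pscore": 0.40},
    {"id": 3, "pscore": 0.75},
    {"id": 4, "pscore": 0.97},
]
kept = trim(units)
print([u["id"] for u in kept])
```

A normalized difference that is large in absolute value, as in this toy example, is the kind of imbalance that would motivate the propensity-based steps.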

In the analysis phase, we estimate treatment effects using a number of reasonable estimators. The estimators with the most desirable properties are

  1. The blocking estimator invoked by est_via_blocking, which aggregates the least squares estimates within each propensity bin to produce an overall average treatment effect estimate.
  2. The matching estimator invoked by est_via_matching, which pairs subjects together via nearest-neighbor matching to arrive at an overall average treatment effect estimate. Since bias can result when the matching is imperfect, bias correction is recommended.
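
The idea behind each estimator can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the package's implementation: the blocking sketch swaps in a raw within-bin difference in means for the within-bin least squares fit, and the matching sketch matches on a single covariate with replacement and applies no bias correction. The function names and the toy data are hypothetical.

```python
from statistics import mean


def blocking_ate(strata):
    """Weight each within-bin effect estimate by the bin's share of the sample."""
    total = sum(len(s["treated"]) + len(s["control"]) for s in strata)
    ate = 0.0
    for s in strata:
        n_bin = len(s["treated"]) + len(s["control"])
        # Simplification: difference in means instead of within-bin least squares.
        effect = mean(s["treated"]) - mean(s["control"])
        ate += (n_bin / total) * effect
    return ate


def matching_ate(units):
    """Single nearest-neighbor matching on one covariate, with replacement.

    units is a list of (x, d, y) tuples: covariate, treatment indicator, outcome.
    Each unit's missing potential outcome is imputed from the closest unit
    in the opposite group. No bias correction is applied.
    """
    effects = []
    for x, d, y in units:
        candidates = [(abs(x - xo), yo) for xo, do, yo in units if do != d]
        _, y_match = min(candidates)
        effects.append(y - y_match if d == 1 else y_match - y)
    return mean(effects)


# Hypothetical outcomes grouped into two propensity bins.
strata = [
    {"treated": [5.0, 7.0], "control": [4.0, 4.0, 5.0, 3.0]},
    {"treated": [9.0, 11.0, 10.0], "control": [7.0]},
]
print(blocking_ate(strata))

# Hypothetical (covariate, treatment, outcome) triples.
units = [(1.0, 1, 5.0), (1.1, 0, 3.0), (2.0, 1, 8.0), (2.2, 0, 6.0)]
print(matching_ate(units))
```

In the matching sketch, the gap between each unit's covariate value and that of its match is the source of the bias that the recommended bias correction is meant to reduce.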

If the design phase is done well, the two estimators recommended above should yield similar estimates due to their stability properties.

We conclude by noting that additional checks and tests exist for assessing the validity of a causal study. Some of these tests (e.g., for assessing the unconfoundedness assumption) require no additional tools beyond what is provided in Causalinference. Interested readers may refer to Imbens (2014).


Imbens, G. (2014). Matching methods in practice: Three examples. NBER Working Paper No. 19959.