Last week I taught a course on graphical models of causation at a summer school in Montenegro. You can find my slides and accompanying R code in the teaching section of this page. It was lots of fun and I got great feedback from students. After the workshop we had stimulating discussions about the usefulness of this new approach to causal inference in economics and business. I’d like to pick up one of those points here, as it is an argument I frequently hear when talking to people trained in classical econometrics.

PhD students in economics are usually well-trained in the potential outcome framework. Therefore, I mostly frame directed acyclic graphs (DAGs) as a useful complement to the standard treatment effects estimators, in order to conceal my true revolutionary motives. ;) One concern with DAGs I sometimes encounter, though, is that they require so many strong assumptions about the presence (and absence) of causal relationships between the variables in your model. By contrast, so the argument goes, for treatment effects estimators such as nearest-neighbor matching you only have to justify the exogeneity of your treatment, and that’s it. No need to specify a full causal model.

This argument is misguided. You always need a “full” causal model in order to do proper causal inference. But let me specify in more detail what I mean by this. In matching (or inverse probability weighting, or regression, or any other method that relies on unconfoundedness) you encounter a situation like the following.

You would like to estimate the effect of a treatment *T* (e.g., an R&D subsidy) on an outcome variable *Y* (e.g., firm growth). The problem is that there are other variables, *X*, out there that create a correlation between the treatment and the outcome. You first need to control for these confounding factors in order to get at the true causal effect of *T* on *Y*.

In the potential outcome framework this means that you need to justify the *unconfoundedness* assumption.
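
In the usual potential outcome notation (assumed here: *Y*(1) and *Y*(0) denote the outcome with and without treatment), the assumption can be written roughly as:

$$
\bigl(Y(0),\, Y(1)\bigr) \perp\!\!\!\perp T \mid X
$$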

If the treatment is independent of potential outcomes conditional on *X*, and you’re able to measure all these influence factors *X*, then you’re fine. The crux, though, is: what is *X*? Which variables do you need to control for? And which other influence factors can you safely leave uncontrolled? To make these claims you need to have a causal model, at least in your mind. And here the circle closes.

Every time you estimate something that entails the unconfoundedness assumption, you imply that your data is generated by a causal process such as the one depicted above. So treatment effects estimators don’t require fewer assumptions than graphical approaches; they just apply to one very specific causal model. If that model fits reality, great! Then you can go out and apply treatment effects methods “off the shelf”. But if it doesn’t, you need to think harder about an appropriate model. And DAGs offer you a tremendously useful tool set to handle these types of situations.
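
To make the unconfoundedness template concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than part of the original workshop material: the function name `simulate_confounding`, the linear data-generating process, and the true effect of 2.0 are all made up. The data follow the simple model in which *X* causes both *T* and *Y*, so regressing *Y* on *T* alone should be biased, while additionally controlling for *X* should recover the effect.

```python
import numpy as np


def simulate_confounding(n=100_000, effect=2.0, seed=0):
    """Simulate X -> T, X -> Y, T -> Y and compare two OLS estimates.

    Hypothetical data-generating process, chosen only for illustration.
    Returns (naive, adjusted): the coefficient on T without and with
    controlling for the confounder X.
    """
    rng = np.random.default_rng(seed)

    x = rng.normal(size=n)                            # observed confounder
    t = (x + rng.normal(size=n) > 0).astype(float)    # treatment depends on X
    y = effect * t + 1.5 * x + rng.normal(size=n)     # outcome depends on T and X

    ones = np.ones(n)

    # Naive regression: Y on a constant and T only (ignores X).
    naive_design = np.column_stack([ones, t])
    naive = np.linalg.lstsq(naive_design, y, rcond=None)[0][1]

    # Adjusted regression: Y on a constant, T, and the confounder X.
    adjusted_design = np.column_stack([ones, t, x])
    adjusted = np.linalg.lstsq(adjusted_design, y, rcond=None)[0][1]

    return naive, adjusted


naive, adjusted = simulate_confounding()
print(f"true effect: 2.0, naive: {naive:.2f}, adjusted for X: {adjusted:.2f}")
```

The adjusted estimate is only trustworthy here because the simulated world matches the assumed causal model exactly. With a different structure behind the data, the same regression could be just as misleading as the naive one.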

Here’s the causal model that applies to the second most prominent estimator from the treatment effects literature: the non-parametric IV estimator.

In this situation there is no possibility of ever controlling for all confounding influences, because some of them remain unobserved (denoted by the dashed bidirected arc between *T* and *Y*). As a result, unconfoundedness will be violated. But instead you can do something else. You can use variation in a third variable *Z* to get at the causal effect of *T* on *Y*.¹ In order for that to work, though, the instrument has to satisfy a condition very similar to unconfoundedness.

Your instrument has to be *excludable*, or independent of potential outcomes given a vector of control variables *X*. You see that you’re basically left with the same problem. How do you decide what is in *X*, and what isn’t?
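
The IV template can be sketched in the same way. Again, the setup is a hypothetical illustration: the function name `simulate_iv`, the coefficients, and the constant treatment effect of 2.0 are assumptions made for this example. An unobserved variable *U* drives both *T* and *Y*, while a binary instrument *Z* shifts *T* but has no direct path to *Y*. Because the simulated effect is the same for everyone, the complier effect coincides with the overall effect, which will not hold in general.

```python
import numpy as np


def simulate_iv(n=200_000, effect=2.0, seed=1):
    """Simulate Z -> T -> Y with an unobserved confounder U of T and Y.

    Hypothetical data-generating process, chosen only for illustration.
    Returns (ols, wald): the naive OLS slope of Y on T, and the Wald / IV
    estimate based on the binary instrument Z.
    """
    rng = np.random.default_rng(seed)

    u = rng.normal(size=n)                          # unobserved confounder
    z = rng.integers(0, 2, size=n).astype(float)    # randomly assigned instrument

    # Treatment take-up rises with Z for everyone (monotonicity holds here).
    t = (z + u + rng.normal(size=n) > 0).astype(float)

    # Z enters Y only through T (exclusion), while U affects Y directly.
    y = effect * t + 2.0 * u + rng.normal(size=n)

    # Naive OLS slope of Y on T: biased because U is omitted.
    ols = np.cov(t, y)[0, 1] / np.var(t, ddof=1)

    # Wald estimator: reduced-form difference over first-stage difference.
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = t[z == 1].mean() - t[z == 0].mean()
    wald = reduced_form / first_stage

    return ols, wald


ols, wald = simulate_iv()
print(f"true effect: 2.0, naive OLS: {ols:.2f}, Wald/IV: {wald:.2f}")
```

The Wald estimate only works because the code rules out any direct effect of *Z* on *Y* and any link between *Z* and *U*. Those are exactly the causal assumptions you would have to defend with real data.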

In sum, treatment effects estimators such as matching and IV simply hand you a template of a causal model. If this template describes reality accurately, you can easily obtain causal effect estimates with the help of standard techniques. Graphical models capture the same standard cases, but on top of that provide you with a much more versatile toolbox for causal inference. The impression that matching and IV require fewer assumptions is a misconception. I admit it’s probably still easier to convince reviewers with the standard methods, simply because we’re so used to them. But that’s just a sign of an imperfection in the scientific process and says nothing about any substantive differences between the two approaches. Causal inference requires strong assumptions, one way or the other. There is no such thing as a free lunch in econometrics either.

¹ In addition, you will need to assume monotonicity, i.e., a monotone influence of *Z* on *T* for all members of the population. And even then, you can only identify an effect for the subgroup of compliers, who change their treatment status due to the instrument (for binary *Z*). These details are of secondary importance for the argument here. If you’re interested, you can check the seminal paper by Imbens and Angrist (1994) on nonparametric IV.