Why Tobit models are overused

In my field of research we often run regressions with innovation expenditures or sales with new products on the left-hand side. We usually observe many zeros for these variables because many firms do not invest in R&D at all and therefore do not come up with new products either. Many researchers then feel inclined to use Tobit models. Frankly, I never understood why. In an earlier version of one of my papers I commented on this:

Authors in previous studies (Laursen and Salter, 2006; Leiponen and Helfat, 2010, 2011; Klingebiel and Rammer, 2014) often rely on limited dependent variable models, namely a Tobit type I regression (Tobin, 1958; Amemiya, 1985), because they recognize the non-negativity of sales with new products. In agreement with Angrist and Pischke (2009), we break with this tradition as we do not make sense of a latent variable interpretation with a separate censoring mechanism that forces negative sales to be zero. Rather we think that zeros occur naturally in this setting. Another justification for the Tobit model is sometimes provided by a hurdle model interpretation (Cragg, 1971). Here, the censoring point is thought as a threshold of “participation” which is modeled by a separate probabilistic process. Excess zeros (e.g., relative to the likelihood of a normal distribution) occur because a part of the sample is simply reluctant to engage in any innovation activities. We think that such a two-part approach is not appropriate for our application either as some form of innovation activity is a necessary condition to appear in our sample. In addition, we do not require fitted values to satisfy boundary conditions at the lower ends of the distribution, since we are not interested in effects that appear in certain distributional ranges of the dependent variable. Estimation by ordinary least squares, in contrast, conveniently allows to incorporate cluster-robust standard errors (clustered at the firm level) which is advisable when analyzing survey data considering that some firms appear in both survey waves.
A couple of points, partly reiterating the above:
  • Cases of firms with no innovation expenditures (or sales with new products) are natural zeros. There is no censoring or truncation mechanism that forces negative expenditures to appear as zeros in the data. The actual value is zero, period.
  • Some people worry that with lots of zeros in the data the distribution of the outcome variable, Y, becomes very skewed. First of all, OLS can handle that, as it doesn’t require normal errors for consistency. Secondly, if you worry about skewness there are other models you could use, such as Poisson regression, which is more robust to distributional misspecification than Tobit.
  • Most importantly, if you introduce a latent variable (as in Tobit) you’d better have a good structural interpretation for it (like Heckman in his female labor supply example). If you argue, for example, that zero innovation expenditures are the result of a firm’s profit maximization problem (in which expected future cash flows are traded off against project costs), then you should model this decision explicitly and tell me why you’re specifically interested in the effects on the latent rather than the observed variable. Everything else is too handwavy for my taste. In other words, if you’re doing reduced-form econometrics, do it properly, or else switch to a fully fledged structural model!
  • A Heckman selection model “is equivalent to a Tobit model with stochastic threshold” (Cameron & Trivedi 2005, ch. 16.5.2) and therefore relies on a similar set of strong distributional assumptions. So if you’re worried about endogeneity as a result of sample selection, I would usually advise you to go with two-stage least squares instead.
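
To make the point about natural zeros and skewness a bit more concrete, here is a minimal simulation sketch. Everything in it is a made-up toy setup, not taken from any of the papers cited above: the regressor x is a hypothetical firm-level dummy, 50% or 70% of firms (depending on x) have exactly zero sales with new products, and the remaining firms draw sales from an exponential distribution with mean 10. The true effect of x on the conditional mean of y is therefore 0.5 * 10 - 0.3 * 10 = 2. The function names are illustrative only. With a single dummy regressor, the OLS slope is just the difference in group means, and the Poisson quasi-ML slope is the log of the ratio of group means, so no latent variable is needed for either.

```python
import math
import random


def simulate_firms(n, seed=0):
    """Simulate (x, y) pairs with natural zeros in y (hypothetical toy setup)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)              # assumed binary firm characteristic
        p_innovate = 0.3 + 0.2 * x         # probability of positive sales
        if rng.random() < p_innovate:
            y = rng.expovariate(1 / 10.0)  # skewed positive outcome, mean 10
        else:
            y = 0.0                        # a natural zero, nothing is censored
        data.append((x, y))
    return data


def ols_slope(data):
    """Slope of a simple OLS regression of y on x (with an intercept)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var_x = sum((x - mean_x) ** 2 for x, _ in data)
    return cov_xy / var_x


def group_mean(data, group):
    """Sample mean of y among observations with x == group."""
    ys = [y for x, y in data if x == group]
    return sum(ys) / len(ys)


data = simulate_firms(200_000, seed=1)
share_zeros = sum(1 for _, y in data if y == 0.0) / len(data)

# OLS targets E[y | x]; the true difference in conditional means is 2.
slope = ols_slope(data)

# With one dummy regressor, the Poisson quasi-ML first-order conditions are
# solved by the group means, so the slope equals the log ratio of the means.
poisson_slope = math.log(group_mean(data, 1) / group_mean(data, 0))

print(f"share of zeros:      {share_zeros:.3f}")
print(f"OLS slope:           {slope:.3f}  (true value 2)")
print(f"Poisson QML slope:   {poisson_slope:.3f}  (true value log(5/3))")
```

Despite the majority of observations being exact zeros and the positive part being heavily skewed, the OLS slope should land close to the true difference in conditional means. This is only an illustration for a saturated model with one dummy. With continuous regressors OLS gives a linear approximation to the conditional mean, and in real panel data you would additionally want cluster-robust standard errors, which this sketch does not compute.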