At a recent conference dinner with applied microeconomist friends, the same thing happened that happens at each of these dinners once you run out of steam complaining about Referee #2: you start wondering… should we really be running cross-country regressions? Inevitably, two concerns about cross-country regressions came up. The first is causality, which can bias your estimates; we’re not going to talk about that one today. The second, less discussed and often sneakier, issue is that calculating your standard errors is also challenging! In common cross-country regressions, both left-hand side and right-hand side variables are strongly spatially correlated, and this causes known problems for getting your standard errors right (discussed here).
Case study: Absolute latitude and GDP
One useful case study for these challenges is the well-researched correlation between absolute latitude (or distance to the equator in degrees of latitude) and GDP. Lots of ink has been spilled on proposed explanations, ranging from the diffusion of agriculture to ultraviolet radiation. In this post, we consider an alternative possibility: this correlation may be a spurious artifact of spatial correlation. In particular, neighboring countries tend to have similar latitudes and similar GDP. As a result, for any possible “random assignment” of latitudes, many rich countries might happen to have similar (either high or low) absolute latitudes.
Randomization inference in space
We can test this by randomly drawing equators. This process is visualized in the figure above. First (top panel), we dropped 100,000 “north poles” on random points on the surface of the Earth and calculated the associated random equators. Second (bottom panel), we regressed actual GDP on the newly assigned absolute latitudes.
As a brief aside: in practice, implementing this required remembering some trigonometry and making use of a neat property of multivariate normal random variables (remember the assumption of “spherical” errors from econometrics?). These details, and all of the code used to generate the figures and analysis for this blog post, are posted on GitHub in all their gory detail.
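For readers who want the flavor of that trigonometry without opening the repository, here is a minimal Python sketch. It is not the code posted on GitHub; the function names and the NumPy implementation are our own illustrative assumptions. It relies on the fact that a vector of three independent standard normal draws, rescaled to unit length, is uniformly distributed on the sphere, and that a location's latitude relative to any pole is the arcsine of the dot product between the two unit vectors.

```python
import numpy as np

def random_poles(n, rng):
    """Draw n points uniformly on the unit sphere.

    Three independent standard normals have a rotationally symmetric
    ("spherical") joint distribution, so normalizing them gives a
    uniform direction.
    """
    z = rng.standard_normal((n, 3))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def to_unit_vectors(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to 3-D unit vectors."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    return np.column_stack([
        np.cos(lat) * np.cos(lon),
        np.cos(lat) * np.sin(lon),
        np.sin(lat),
    ])

def placebo_abs_latitude(lat_deg, lon_deg, poles):
    """Absolute latitude (degrees) of each location relative to each pole.

    Returns an array of shape (n_poles, n_locations).
    """
    xyz = to_unit_vectors(lat_deg, lon_deg)
    # The dot product with the pole is the sine of the latitude.
    sin_lat = np.clip(poles @ xyz.T, -1.0, 1.0)
    return np.degrees(np.abs(np.arcsin(sin_lat)))
```

Passing the true north pole, `np.array([[0.0, 0.0, 1.0]])`, returns each location's actual absolute latitude, which is a handy sanity check before drawing thousands of random poles.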
This procedure is an example of randomization inference (discussed previously on this blog here, including links to discussions of some of its nice technical properties). Typically, this is used for experiments: when we know an intervention is randomly assigned, then we should estimate a zero-effect using random assignments that did not occur. Due to sampling noise, some estimates using these counterfactual random assignments will happen to be large; the share that are larger than our actual estimate gives us the p value associated with our estimate.
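To make the experimental case concrete, here is a small Python sketch on simulated data. Everything in it is an illustrative assumption: the sample size, the effect size of 0.5, complete randomization of half the sample into treatment, and the helper names. It re-draws the treatment assignment many times, re-estimates the coefficient each time, and reports the share of placebo coefficients at least as large in magnitude as the actual one.

```python
import numpy as np

def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def ri_p_value(actual, placebo):
    """Share of placebo coefficients at least as large in magnitude as `actual`."""
    placebo = np.asarray(placebo, dtype=float)
    return float(np.mean(np.abs(placebo) >= abs(actual)))

def rerandomize(treatment, n_draws, rng):
    """Counterfactual assignments: shuffle who is treated, keeping the number treated fixed."""
    return [rng.permutation(treatment) for _ in range(n_draws)]

# Simulated experiment (all numbers are made up for illustration).
rng = np.random.default_rng(0)
treatment = np.repeat([0.0, 1.0], 50)
outcome = rng.standard_normal(100) + 0.5 * treatment

actual = ols_slope(outcome, treatment)
placebo = [ols_slope(outcome, t) for t in rerandomize(treatment, 2000, rng)]
print(ri_p_value(actual, placebo))
```

The only design decision that matters here is `rerandomize`: it should mimic how treatment was actually assigned, which is exactly the step that becomes harder with natural experiments.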
However, randomization inference can also be used with natural experiments, in cases when the researcher is willing to make assumptions about what is random about the right-hand side. This has some important advantages over conventional robust standard errors, including avoiding the need to make assumptions about the left-hand side!
Using this example, we find that conventional standard errors, including those intended to correct for spatial correlation, are severely biased.
First, using randomization inference, we calculate a p value of .032; 3.2% of coefficients estimated using randomly drawn equators were larger in magnitude than the coefficient using the actual equator. This is in contrast with robust standard errors clustered at the country level, which return a p value of <.0000000000000002!
This difference is not automatic with randomization inference for a natural experiment, as it is important to identify what determines variation in the right-hand side. In our case, we have assumed that variation in absolute latitude comes from the “random” location of the north pole, so we randomly dropped new north poles. Alternatively, one could have (wrongly) assumed that variation in absolute latitude is random across countries. In this case, to conduct randomization inference, one would randomly permute absolute latitude across countries. We compare the distribution of estimates under these two approaches (“Spatial” and “Naive”, respectively) in the top panel of Figure 2. The latter naive approach does not account for spatial correlation in absolute latitude; as a result, this naive approach returns a much narrower distribution of estimates, yielding a very small p value (comparable to using robust standard errors clustered at the country level).
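The naive approach is easy to sketch in Python. The helper names below are hypothetical and this is not the code behind Figure 2. `naive_permutations` shuffles absolute latitude across countries with no regard for geography, and `slopes` computes the OLS coefficient for every placebo regressor at once. The spatial approach would pass a matrix of absolute latitudes relative to randomly drawn poles to the same `slopes` function instead.

```python
import numpy as np

def slopes(y, X):
    """OLS slope of y on each row of X (one placebo regressor per row), with an intercept."""
    Xc = X - X.mean(axis=1, keepdims=True)
    yc = y - y.mean()
    return (Xc @ yc) / np.einsum("ij,ij->i", Xc, Xc)

def naive_permutations(x, n_draws, rng):
    """Shuffle x across units independently in each draw, ignoring where units are located."""
    return np.stack([rng.permutation(x) for _ in range(n_draws)])
```

Because each shuffle breaks the link between a country and its neighbors, the placebo latitudes from `naive_permutations` are no longer spatially correlated, which is why the resulting distribution of coefficients is too narrow.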
One commonly proposed solution to spatial correlation is to use Conley standard errors. However, similar to cluster-robust standard errors, these perform well only when there is a reasonable number of independent clusters (typically at least 30). In the bottom panel of our second figure, we compare Conley standard errors, using a radius of allowed spatial dependence ranging from 0km to 6000km, to the randomization inference standard errors. Although Conley standard errors (with a degrees-of-freedom correction) perform much better than cluster-robust standard errors, they are biased downward by at least 20% at all radii. The likely source of this bias is that, for radii above 1000km, the effective number of independent clusters falls below the common threshold of 30. While these specific results do not necessarily generalize beyond cross-country studies (or to different right-hand side variables), the broad intuition holds whenever your left-hand side and right-hand side are both strongly spatially correlated.
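For reference, here is a bare-bones Python sketch of a Conley-style variance estimator for a bivariate regression. It is not the implementation used for the figure. It assumes a uniform kernel (every pair of countries within the radius gets weight one), great-circle distances from the haversine formula with an Earth radius of 6371km, and no degrees-of-freedom correction; the function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lat_deg, lon_deg):
    """Matrix of pairwise great-circle distances in km."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))[:, None]
    lon = np.radians(np.asarray(lon_deg, dtype=float))[:, None]
    dlat = lat - lat.T
    dlon = lon - lon.T
    a = np.sin(dlat / 2) ** 2 + np.cos(lat) * np.cos(lat.T) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def conley_se(y, x, lat_deg, lon_deg, radius_km):
    """OLS of y on x with an intercept, plus uniform-kernel Conley standard errors.

    Returns (coefficients, standard errors), intercept first.
    """
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    # Weight 1 for pairs within the radius (including each unit with itself), else 0.
    K = (haversine_km(lat_deg, lon_deg) <= radius_km).astype(float)
    scores = X * resid[:, None]
    meat = scores.T @ K @ scores
    bread = np.linalg.inv(X.T @ X)
    V = bread @ meat @ bread
    # A uniform kernel does not guarantee a positive variance estimate;
    # a negative diagonal entry would show up here as NaN.
    return beta, np.sqrt(np.diag(V))
```

With a radius of 0km and distinct locations, the kernel matrix is the identity and the estimator collapses to the usual heteroskedasticity-robust (HC0) standard errors. As the radius grows, more pairs of residuals enter the sum and fewer effectively independent groups remain.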
- Inference with natural experiments is hard! Unlike traditional experiments, the right-hand side may not be explicitly randomized across units, and robust standard errors may perform poorly when the effective number of clusters is very small (although effective corrections do exist for cluster-robust standard errors with few clusters).
- It may not always be obvious that there is a small number of clusters! To get inference right, it’s important to be explicit about where the variation in your right-hand side comes from, i.e., what the assignment mechanism is (discussed here). Although clustering at the correct level or using the correct radius for Conley errors can correct for these problems, it is typically ideal to simulate the assignment mechanism for randomization inference.
- In this case, we are asserting the source of variation in absolute latitude comes from the location of the poles. To give another example, for rainfall (a popular source of shocks), it comes from variation in year-to-year weather realizations; recent work has therefore proposed permuting historical weather realizations across years to conduct randomization inference for estimating the impacts of weather shocks.
- When it’s not possible to fully simulate the assignment mechanism, placebo checks can be helpful as well. For example, work studying infrastructure has shown that unrealized construction plans do not have significant (economically or statistically) impacts on outcomes; this can be a useful validation of the procedures used for both estimation and calculating standard errors.
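To illustrate the rainfall example from the list above, here is a hedged Python sketch of permuting weather realizations across years. It is not the procedure from the work referenced there. It assumes a balanced panel stored as units-by-years arrays, a two-way fixed effects specification, and that whole years of weather are exchangeable; the function names are hypothetical.

```python
import numpy as np

def twfe_slope(y, x):
    """Slope of y on x after removing unit and year means (balanced panel only).

    y and x are arrays of shape (n_units, n_years).
    """
    def demean(a):
        return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()
    yd, xd = demean(y), demean(x)
    return float((xd * yd).sum() / (xd * xd).sum())

def year_permutation_slopes(y, rain, n_draws, rng):
    """Placebo slopes from reassigning entire years of rainfall to other years.

    Each year's map of rainfall across units is kept intact, so the spatial
    correlation within a year is preserved in every placebo draw.
    """
    n_years = rain.shape[1]
    return np.array([
        twfe_slope(y, rain[:, rng.permutation(n_years)])
        for _ in range(n_draws)
    ])
```

The p value is then the share of placebo slopes at least as large in magnitude as the slope estimated with the actual rainfall history.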