# Just a little Bartik exposure

There are no two ways about it: exposure designs are *real* popular. Taking an example from some of our ongoing work, exposure designs can help us answer questions such as “did local production of high-quality masks help slow the spread of COVID?” So what is an exposure (i.e., shift-share) design, and why/when is it safe to believe it returns the causal impacts of an aggregate differential exposure of interest?

In this blog post, we review recent progress in our understanding of exposure designs, which use variation across units in exposure to higher-level shocks. We cover what exposure designs estimate, what assumptions are necessary to estimate causal effects of exposure, and diagnostic tests for these assumptions. David’s recent blog post discussed instrumental variable approaches; we’ll touch on reduced form analysis, and discuss two distinct approaches to recovering causal effects: assuming exposure “shares” are as good as random, and assuming exposure “shocks” are as good as random.

To review a little history (via Goldsmith-Pinkham et al. (2020)), recent use of exposure designs in economics began in Chapter 5 of Bartik (1991), which estimates the impacts of “[expected] growth [in employment] if all local industries had grown at the national growth rate for that industry, and thus reflects trends in national demand for the [local] area’s exports.” While this eponymous “Bartik instrument” has been perhaps the most common instrument across exposure designs, closely related approaches have been applied to answer a number of important questions in economics. This includes the impacts of migration on local labor markets (Card, 2001), the impacts of food aid on conflict (Nunn & Qian, 2014), and as we revisit later, the impacts of import competition from China on local labor markets (Autor et al., 2013).

## What is a shift-share?

### Constructing exposure

To fix ideas, and following this recent post, we consider a simplified example of estimating the impacts of increased competition from Chinese imports on employment in US counties based on Autor et al., 2013 (“ADH”). To estimate these impacts, one approach would be to simply regress

\[ \text{Employment}_{i} = \alpha + \beta \text{Imports from China}_{i} + \epsilon_{i} \]

where i represents US counties. Unfortunately, \( \beta \) is unlikely to have our desired causal interpretation. If county i is very productive, both employment and imports from China are likely to be high, generating omitted variable bias.

As an alternative, we can construct a measure of competition from Chinese imports that is less likely to be affected by productivity (and other unobservables) in county i, by measuring exposure to import competition through county i’s industrial composition. Using j to index industries, we can define county i’s exposure as the sum across industries of county i’s employment share in industry j times total US imports from China in industry j per US worker in industry j:

\[ \text{Exposure to import competition}_{i} = \sum_{j} \text{Employment share}_{ij} \times \text{Imports from China per US worker}_{j} \]

This exposure measure is a shift-share: for each county, we construct exposure to import competition by multiplying “shares” (here, the share of county i’s employment in industry j) by “shifts” or “shocks” (here, imports from China in industry j per US worker in industry j), and summing across industries. As the “shock” is constant within industry across counties, variation in exposure to import competition across counties comes from variation across counties in the “shares”: counties with high exposure to import competition are counties that have high employment shares (large “shares”) in industries with relatively high imports per worker (large “shocks”).
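To make the construction concrete, here is a minimal sketch with made-up numbers. The three counties, two industries, and all values below are hypothetical, chosen only to show that exposure is a matrix of shares times a vector of shocks.

```python
import numpy as np

# Hypothetical toy data: 3 counties (rows) x 2 industries (columns).
# Each row holds one county's employment shares across industries.
employment_share = np.array([
    [0.8, 0.2],  # county A: mostly industry 0
    [0.5, 0.5],  # county B: evenly split
    [0.1, 0.9],  # county C: mostly industry 1
])

# Hypothetical industry-level shocks: imports from China per US worker.
imports_per_worker = np.array([1.0, 3.0])

# Shift-share exposure: for each county, sum over industries of share * shock.
exposure = employment_share @ imports_per_worker

print(exposure)
```

County C ends up with the highest exposure simply because most of its employment is in the industry with the larger shock; the shocks themselves are identical across counties.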

Note that we can interpret exposure to import competition as a predicted outcome: if imports from China in industry j substitute for US workers in industry j, then exposure to import competition can be interpreted as the predicted decrease in employment in county i. It is the sum of the average predicted decreases in employment (imports from China per US worker in industry j, or the “shock”) times their relevance to county i (the share of county i’s employment in industry j, or the “share”). Many shift-share exposure measures have this flavor: they construct a predicted outcome by summing over group-level average changes in the outcome times unit-specific weights on each group.

### Exogenous exposure?

To estimate the impacts of increased exposure to import competition, we can regress

\[ \text{Employment}_{i} = \alpha + \beta \text{Exposure to import competition}_{i} + \epsilon_{i} \]

For \( \beta \) to recover the causal effect of exposure to import competition, exposure to import competition must be as good as random. However, this assumption is hard to interpret given how we’ve constructed exposure to import competition. First, it is constructed from industry-level shocks (imports from China per US worker in industry j) that are constant across counties; by construction, these shocks cannot be randomly assigned across US counties. Second, these shocks are multiplied by potentially non-random shares (county i’s employment share in industry j). It seems like we are combining two non-random sources of variation and hoping to recover something as good as random!

Below, we’ll discuss recent work that asks exactly this question — what assumptions are needed to recover the causal effect of a shift share exposure measure? It turns out that we cannot simply combine two non-random sources of variation and hope to recover something as good as random; instead, it is sufficient if either the shares (county i’s employment in industry j) are random across counties, or the shocks (imports from China per US worker in industry j) are random across industries.

## So, what’s new in shift-share?

### Exogenous shares

**Identification** The first approach we consider is that the shares (county i’s employment share in industry j) are as good as randomly assigned across counties, as considered in Goldsmith-Pinkham et al. (2020) (“GSS”). To understand this, they show that the estimated effect of exposure is simply a weighted sum across industries j of coefficients from regressions of county outcomes on county shares for industry j, with weights depending on the shocks and the variance of the shares. That is, when we regress employment on exposure to import competition, the estimated effect of exposure is exactly a weighted sum of coefficients from regressions of employment in county i on county i’s employment share in industry j, one regression for each j. If these shares are as good as random, each of these coefficients represents a causal effect of exposure to industry j.
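The decomposition can be checked numerically. The sketch below uses simulated (hypothetical) data and the simple reduced-form regression from this post. The weight formula, shock times the variance of the share divided by the variance of exposure, is what falls out of the algebra for this bivariate case; it is our illustration rather than the exact expression in GSS, whose results cover the more general instrumental variable setting.

```python
import numpy as np

rng = np.random.default_rng(0)
n_counties, n_industries = 500, 4

# Simulated (hypothetical) shares, shocks, and outcome.
shares = rng.dirichlet(np.ones(n_industries), size=n_counties)
shocks = rng.normal(size=n_industries)
exposure = shares @ shocks
employment = 2.0 * exposure + rng.normal(size=n_counties)


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    x_dm = x - x.mean()
    return (x_dm @ (y - y.mean())) / (x_dm @ x_dm)


# Coefficient from regressing the outcome on shift-share exposure.
beta_exposure = slope(exposure, employment)

# One regression per industry: outcome on that industry's share alone.
beta_by_industry = np.array(
    [slope(shares[:, j], employment) for j in range(n_industries)]
)

# Weights depend on the shocks and the variance of each share.
weights = shocks * shares.var(axis=0) / exposure.var()

beta_from_decomposition = weights @ beta_by_industry

print(beta_exposure, beta_from_decomposition)
```

The two printed numbers agree up to floating-point error. Note that the weights need not be positive or sum to one, which is part of why it is worth looking at which industries drive the estimate.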

Lastly, for our causal effects to be interpretable as reflecting the impacts of import competition, it must be the case that exposure to industry j only affects county i’s employment through its effect on import competition. Note that exclusion restrictions are not particularly testable (e.g., see this DI blog post on the exclusion restriction for rainfall instruments) so we need to come up with alternative diagnostic tests.

**Diagnostic tests** GSS propose 3 diagnostic tests for the validity of the assumption that the shares are as good as random:

1) Balance checks: First, one can test the correlation between exposure and county characteristics. If we thought exposure was as good as randomly assigned, it should be uncorrelated with fixed county characteristics. As correlations with exposure may be complicated to interpret, we may also implement this test using just a single industry’s share that explains the most variation in exposure.

2) Placebo tests: Second, if impacts of exposure are only expected after a certain time period (if, for example, the shocks used to construct exposure occurred after a certain date), then we should not expect impacts of exposure on outcomes in earlier time periods.

Note that in general, both balance checks and placebo tests (and associated robustness tests) can be extended from the one time period analysis we discuss to difference-in-differences; see our blog posts on this here and here.

3) Alternative shocks: Third, above we assumed that the shares are as good as random, and we noted that the shocks effectively act as weights on the shares when estimating the effect of exposure. Therefore, finding similar estimates with alternative choices of shocks, as long as the resulting weights are sufficiently different, provides reassurance that estimates are not overly sensitive to how the shares are weighted.

**Inference** Lastly, we note that although shares vary across counties, it is often the case that shares may be strongly correlated across neighboring counties. In general, cluster robust standard errors accounting for correlated assignment of shares may be necessary (see discussion of how to choose the level at which you cluster here).

### Exogenous shocks

**Identification** The second approach we consider is that the shocks (imports from China per US worker in industry j) are as good as random across industries. Borusyak et al. (2020) (“BHJ”) show that the estimated effect of exposure can instead be represented as an industry-level weighted regression of share weighted sums of outcomes on the industry-level shocks. That is, when we regress employment on exposure to import competition, the estimated effect of exposure is identical to the estimated weighted effect of the industry-level shocks on particular weighted sums of outcomes. If these shocks are as good as random, this regression will recover the effect of the shocks on outcomes through exposure.
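A small simulation can illustrate this equivalence. Everything below is a sketch on hypothetical data, and it assumes each county's shares sum to one. Because our example regresses the outcome directly on exposure, the industry-level regression here takes the form of a weighted instrumental variable regression in which the shock instruments for share-weighted exposure; that is our reading of the equivalence for this simple case, not a formula quoted from BHJ.

```python
import numpy as np

rng = np.random.default_rng(1)
n_counties, n_industries = 300, 20

# Simulated (hypothetical) data; each county's shares sum to one.
shares = rng.dirichlet(np.ones(n_industries), size=n_counties)
shocks = rng.normal(size=n_industries)
exposure = shares @ shocks
employment = -1.5 * exposure + rng.normal(size=n_counties)

# County-level regression of employment on exposure (with an intercept).
y_dm = employment - employment.mean()
x_dm = exposure - exposure.mean()
beta_county = (x_dm @ y_dm) / (x_dm @ x_dm)

# Industry-level data: each industry's weight is its total share, and its
# outcome and exposure are share-weighted averages of the demeaned
# county-level variables.
industry_weight = shares.sum(axis=0)
y_bar = shares.T @ y_dm / industry_weight
x_bar = shares.T @ x_dm / industry_weight

# Weighted industry-level regression with the shock as the instrument.
shocks_dm = shocks - np.average(shocks, weights=industry_weight)
beta_industry = ((industry_weight * shocks_dm) @ y_bar) / (
    (industry_weight * shocks_dm) @ x_bar
)

print(beta_county, beta_industry)
```

The county-level and industry-level coefficients coincide up to floating-point error, even though the second regression has only as many observations as there are industries.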

**Diagnostic tests** BHJ propose the following tests of the validity of the exposure design and the assumption that the shocks are as good as random:

1) Effective number of observations: First, even when the number of industries is large, a small number of industries may receive a large amount of weight, if most of the variation in shares comes from those industries. BHJ propose a calculation of the effective number of industries, and find in practice it can be quite small — in these cases, standard approaches to inference are likely to perform poorly.

2) Balance and placebo checks: Second, if industry-level shocks are in fact as good as random, they should be uncorrelated with industry characteristics and with share-weighted sums of pre-shock outcomes. These tests can be implemented at the industry level, following the same approach as the main analysis of impacts.

Note here that one might want to include either industry-level controls for industry characteristics correlated with shocks, or county-level controls for county characteristics correlated with import competition exposure. These can both be done simultaneously: in the industry-level regression, one would include controls for both industry characteristics and share-weighted sums of county characteristics.
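For the effective number of industries in the first diagnostic above, one simple calculation is the inverse Herfindahl index of the industry weights. We believe this is close in spirit to what BHJ report, but treat the exact formula, the function name, and the toy shares below as assumptions of this sketch.

```python
import numpy as np


def effective_number_of_industries(shares):
    """Inverse Herfindahl index of industry weights.

    `shares` is a counties x industries array; each industry's weight is
    its total share across counties, normalized so weights sum to one.
    """
    industry_weight = shares.sum(axis=0)
    industry_weight = industry_weight / industry_weight.sum()
    return 1.0 / np.sum(industry_weight ** 2)


# Hypothetical case 1: ten industries with equal shares in every county.
even_shares = np.full((100, 10), 0.1)

# Hypothetical case 2: ten industries, but one dominates every county.
concentrated_shares = np.tile([0.91] + [0.01] * 9, (100, 1))

n_eff_even = effective_number_of_industries(even_shares)
n_eff_concentrated = effective_number_of_industries(concentrated_shares)

print(n_eff_even, n_eff_concentrated)
```

Both examples have ten industries, but in the concentrated case the effective number is close to one, which is the situation in which standard inference is likely to perform poorly.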

**Inference** In this setting, inference may be slightly more complicated than when shares are exogenous, as our unit of observation (the county) is different from the unit at which treatment is assigned (the industry). Three approaches have been proposed to conduct valid inference in this setting. First, Adao et al. (2019) propose a correction for correlation in errors across counties with similar shares. Second, BHJ propose aggregating to conduct the analysis at the industry level, and calculating heteroskedasticity robust or cluster robust standard errors; they show the former is equivalent to the proposed correction in Adao et al. (2019). Third, the randomization inference approach proposed by Borusyak & Hull (2021) and discussed below can be applied, with randomization inference leveraging an estimated data generating process for the shocks.

## Extension to related designs

We considered cases above where shift share exposure was constructed from a mix of as good as random variation (either in shares or in shocks) and non-random variation. A much broader set of interventions of interest are similarly constructed from combining random and non-random variation — examples include measures of access to markets in trade that combine as good as random infrastructure construction with variation in existing infrastructure and market sizes, or exposure to new technologies through social networks that combine random seeding of the technology with variation in adoption of the technology and social network connections. Borusyak & Hull (2021) demonstrate how to isolate the as good as random variation in these more general exposure measures, and calculate standard errors using randomization inference leveraging only the as good as random variation.
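As a rough illustration of this idea in the shift-share case, the sketch below redraws the shocks many times, averages the resulting exposures to get each county's expected exposure, subtracts it, and then uses the same redraws for a randomization inference p-value. It assumes the shocks are exchangeable across industries, so that permuting them is a valid way to redraw them; that is a strong simplifying assumption of this sketch and not something Borusyak & Hull (2021) require. All data are simulated and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_counties, n_industries, n_draws = 200, 30, 999

# Simulated (hypothetical) data. Shares sum to less than one, by an amount
# that differs across counties, so exposure has a non-random component.
coverage = rng.uniform(0.3, 1.0, size=(n_counties, 1))
shares = coverage * rng.dirichlet(np.ones(n_industries), size=n_counties)
shocks = rng.normal(loc=1.0, size=n_industries)
exposure = shares @ shocks
employment = rng.normal(size=n_counties)  # no true effect of exposure


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    x_dm = x - x.mean()
    return (x_dm @ (y - y.mean())) / (x_dm @ x_dm)


# Counterfactual exposures under permuted shocks (assumed exchangeable).
permuted_exposure = np.array(
    [shares @ rng.permutation(shocks) for _ in range(n_draws)]
)

# Expected exposure is the non-random part; subtracting it recenters.
expected_exposure = permuted_exposure.mean(axis=0)
recentered_exposure = exposure - expected_exposure

beta_hat = slope(recentered_exposure, employment)

# Randomization distribution of the coefficient under the sharp null.
beta_null = np.array(
    [slope(draw - expected_exposure, employment) for draw in permuted_exposure]
)
p_value = (1 + np.sum(np.abs(beta_null) >= abs(beta_hat))) / (1 + n_draws)

print(beta_hat, p_value)
```

In this toy setup, expected exposure is essentially each county's total share times the average shock, which is exactly the kind of non-random variation that recentering is meant to strip out.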

In one variant of shift share designs, shift share exposure measures are constructed for observations over time, with observation specific shares multiplied by time period specific shocks. When shocks are assumed to be as good as random, estimation in this setting collapses into an analysis of a time series of shocks and within-time period regressions of outcomes on shares. Unless shares are as good as random, standard cluster robust inference for panel data will not yield valid inference. Instead, serial correlation in this setting effectively makes this analysis a challenging time series econometrics problem — Christian & Barrett (2017) provide diagnostic tests and potential solutions.
