
Rethinking identification under the Bartik Shift-Share Instrument

David McKenzie
While it has been said that “friends don’t let friends use IV”, one exception has been the Bartik or shift-share instrument. Development economists tend to see these instruments used most in the trade and migration literatures, with Jaeger et al. (2018) noting that “it is difficult to overstate the importance of this instrument for research on immigration. Few literatures rely so heavily on a single instrument or variants thereof”, going on to list 60 publications in the last decade alone. Two recent papers by Goldsmith-Pinkham et al. (2018) and Jaeger et al. (2018) look under the hood of this instrument, and provide econometric and economic reasons to re-consider how it has been used.

What is the Bartik or Shift-Share Instrument?
Let’s illustrate this first with a trade example, and then a migration example.
A key question in trade has been the impact of Chinese imports on manufacturing employment in U.S. cities (locations denoted by l):

Change in Manufacturing Employment Rate (l,t) = a + b*Import Exposure (l,t) + e(l,t)

With import exposure defined by:

Import Exposure (l,t) = Sum over k of z(l,k,t)*gUS(k,t)

Where z(l,k,t) are the start-of-period shares of employment in location l in each industry k, and gUS(k,t) is a normalized measure of growth in imports from China to the U.S. in industry k.

The usual concern here is that import exposure is not exogenous, but may be correlated with other characteristics of a location that also affect manufacturing employment. The Bartik instrument used by Autor et al. (2013) is then:

Bartik (l,t) = Sum over k of z(l,k,t-1)*ghigh-income(k,t)

Where z(l,k,t-1) are now lagged (“initial”) shares of employment in location l in industry k, and ghigh-income(k,t) is the growth of imports from China to other high-income countries in industry k. That is, the predicted exposure of a location to Chinese imports is a weighted average of how much China is exporting of each product in general (the “shift”), with weights that come from the initial industry composition in a location (the “shares”).

In the immigration literature, a similar example is to look at the impact of immigration flows into location l (e.g. a U.S. local labor market) on the change in native wages in that location:

Change in log native wages (l,t) = a + b*Immigration(l,t)  + e(l,t)

And again we are concerned that immigration is endogenous, and the Bartik instrument is then:

Bartik (l,t) = Sum over j of z(l,j)*m(j,t)

Where here z(l,j) are the lagged or “initial” shares of immigrants from source country j who live in location l, and m(j,t) is the normalized change in immigration from country j into the U.S. as a whole. The predicted inflow of migrants into a destination is then a weighted average of the national inflow rates from each country (“the shift”), with weights depending on the initial distribution of immigrants (“the shares”).
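Both the trade and the migration versions are the same computation: multiply each location's initial shares by the national shifts and sum. Here is a minimal sketch in Python, assuming a made-up 3-location, 2-industry dataset; the function name shift_share and all the numbers are hypothetical and do not come from either paper.

```python
import numpy as np

def shift_share(shares, shifts):
    """Return the shift-share (Bartik) value for each location.

    shares: (n_locations, n_groups) array of initial shares z(l,k)
    shifts: (n_groups,) array of national shifts g(k)
    """
    shares = np.asarray(shares, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    if shares.ndim != 2 or shares.shape[1] != shifts.shape[0]:
        raise ValueError("shares needs one column per shift")
    # Sum over k of z(l,k) * g(k), computed for every location at once.
    return shares @ shifts

# Hypothetical toy data: 3 locations, 2 industries.
z = np.array([[0.8, 0.2],
              [0.5, 0.5],
              [0.1, 0.9]])
g = np.array([0.10, 0.02])  # made-up national growth rates by industry

bartik = shift_share(z, g)
print(bartik)
```

The location with the largest initial share in the fast-growing industry gets the largest predicted exposure, which is all the instrument is doing.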

The Goldsmith-Pinkham et al. (2018) results
Typically, when someone presents an instrumental variable, they spend a lot of time motivating and defending the exclusion restriction. But this becomes a bit harder to see and do with the Bartik instrument, since it is this weighted average of many different shifts (e.g. 397 different industries and multiple time periods in the trade example).

A first result in Goldsmith-Pinkham et al. is to show that the Bartik instrument is numerically equivalent to using the initial shares (interacted with time fixed effects when multiple periods are used) as instruments in a weighted GMM estimation – the shifts only provide the weights and affect instrument relevance, not endogeneity. They walk through a couple of simple examples to illustrate this in a two-industry or single time-period case, before proving it for the general case. This means you need to argue why the initial shares are exogenous.
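The equivalence is easy to check numerically in the simplest case. The sketch below assumes a single period, no controls other than a constant, and simulated data; the weight matrix W = g g' is the single-period form of the result, and every variable name and number is an illustrative assumption rather than the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_loc, n_ind = 500, 4

# Hypothetical data: initial shares sum to one within each location.
Z = rng.dirichlet(np.ones(n_ind), size=n_loc)
g = np.array([0.10, -0.05, 0.02, 0.30])        # made-up national shifts
x = Z @ g + 0.05 * rng.normal(size=n_loc)      # "endogenous" regressor
y = 2.0 * x + rng.normal(size=n_loc)           # outcome

def demean(a):
    """Partial out the constant."""
    return a - a.mean(axis=0)

Zd, xd, yd = demean(Z), demean(x), demean(y)

# (1) Just-identified IV using the Bartik instrument B = Z g.
Bd = Zd @ g
beta_bartik = (Bd @ yd) / (Bd @ xd)

# (2) GMM using the K shares as instruments with weight matrix W = g g'.
W = np.outer(g, g)
zx, zy = Zd.T @ xd, Zd.T @ yd
beta_gmm = (zx @ W @ zy) / (zx @ W @ zx)

print(beta_bartik, beta_gmm)
```

The two numbers agree up to floating-point error because the shifts enter only through W, which is why the exogeneity argument has to be made about the shares.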

Second, they show that this estimator can be decomposed into a weighted combination of just-identified estimates, each using a single share as an instrument. More importantly, the weights on these instruments (which they call Rotemberg weights) sum to one, with a higher weight indicating that the instrument accounts for a larger share of the identifying variation. This is important since, in practice, a small number of industries or source countries often account for a large share of the identifying variation. This makes clearer which shares, in particular, one needs to argue are exogenous. In the trade example, out of 794 instruments, the authors show the top 5 account for one-third of the absolute weight in the estimation – the industry shares in electronic computers, games and toys, household audio and video, telephone apparatus, and computer equipment. So basically this estimation relies on comparing differences in outcomes (from 2000 to 2007) in places with high and low shares of these industries in 1990.
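The decomposition can be sketched for the same simplified case (one period, constant only, simulated data). The function rotemberg_weights below is a hypothetical helper written for illustration, not the authors' Stata or R implementation, and it ignores controls and multiple periods.

```python
import numpy as np

def rotemberg_weights(Z, g, x, y):
    """Return (alpha, beta_k) for a single period with no controls.

    alpha[k]  : weight on share k; the weights sum to one
    beta_k[k] : just-identified IV estimate using share k alone
    """
    Zd = Z - Z.mean(axis=0)
    xd = x - x.mean()
    yd = y - y.mean()
    zx = Zd.T @ xd                    # covariance of each share with x
    zy = Zd.T @ yd                    # covariance of each share with y
    alpha = g * zx / (g @ zx)
    beta_k = zy / zx
    return alpha, beta_k

# Hypothetical simulated data.
rng = np.random.default_rng(0)
n_loc, n_ind = 500, 4
Z = rng.dirichlet(np.ones(n_ind), size=n_loc)
g = np.array([0.10, -0.05, 0.02, 0.30])
x = Z @ g + 0.05 * rng.normal(size=n_loc)
y = 2.0 * x + rng.normal(size=n_loc)

alpha, beta_k = rotemberg_weights(Z, g, x, y)

# The Bartik IV estimate, for comparison with the weighted sum.
B = Z @ g
beta_bartik = ((B - B.mean()) @ (y - y.mean())) / ((B - B.mean()) @ (x - x.mean()))

# Rank instruments by absolute weight to see which shares drive the estimate.
order = np.argsort(-np.abs(alpha))
print("weights, largest first:", alpha[order])
print("weighted sum:", alpha @ beta_k, "Bartik IV:", beta_bartik)
```

Individual weights can be negative, which is why the post talks about the share of absolute weight when ranking industries.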

While you can never verify that the exclusion restriction holds for an instrument, this at least provides a couple of places to look for how plausible the exogeneity restriction seems:
  • You can see how much the initial shares that account for much of the key variation are correlated with other potential confounders in the initial year. E.g. in the trade example, one sees that areas that initially had a lot of computer manufacturing were also more educated.
  • Because there are lots of instruments getting weighted together, an overidentification test is possible – under the null of constant treatment effects, rejection implies that some of the instruments are endogenous. The authors note the importance of using a test that is sensitive to weak instruments in doing this, given the large number of instruments.
Helpfully, the authors have created Stata code (and R code) to calculate the Rotemberg weights and example datasets to try it out.
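The first of these checks amounts to a simple balance regression: take a share with a high Rotemberg weight and see whether it predicts an initial-year characteristic. Below is a minimal sketch on simulated data, where the variable names (computer_share, college_share) and the built-in positive relationship are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_loc = 1000

# Hypothetical initial-year data for one high-weight industry.
computer_share = rng.uniform(0.0, 0.3, size=n_loc)
# Assumed for illustration: more computer manufacturing goes with more education.
college_share = 0.2 + 1.0 * computer_share + rng.normal(scale=0.05, size=n_loc)

# OLS of the potential confounder on the share (slope first, then intercept).
slope, intercept = np.polyfit(computer_share, college_share, 1)
corr = np.corrcoef(computer_share, college_share)[0, 1]

print(f"slope={slope:.2f}, correlation={corr:.2f}")
```

A clearly non-zero slope does not prove the exclusion restriction fails, but it tells you which confounders a reader will want to see addressed.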

The Jaeger et al. critique and suggested approach
Goldsmith-Pinkham et al. note that two key assumptions in their work are that locations are independent (so there are no spatial spillovers), and that the data consist of a series of steady-states. Jaeger et al. (2018) provide a critique of this second assumption with regard to looking at the effects of immigration, but their point also applies more generally to other settings where adjustment dynamics can be important. Consider again the immigration on wages equation:

Change in log native wages (l,t) = a + b*Immigration(l,t)  + e(l,t)

The standard concern is that there are contemporaneous factors (e.g. local demand shocks) that affect both local native wages and how many immigrants move in. The Bartik instrument is meant to be exogenous to these local demand shocks. However, the main idea in Jaeger et al. is that, if it takes time for markets to adjust to shocks, then the error term e(l,t) can also include other terms, which reflect the ongoing general equilibrium adjustment to past immigrant supply shocks (e.g. capital adjustment). The result is that the Bartik instrument will conflate the short-term response (e.g. a fall in wages when new immigrants enter) and the long-term response (e.g. a positive move back as capital has time to adjust).

Their suggested solution is to control for these dynamics by adding lagged immigrant flows to this regression, and also instrumenting for this with the analogous Bartik instrument:

Change in log native wages (l,t) = a + b*Immigration(l,t) +c*Immigration(l,t-1) + e(l,t)

Where two Bartik instruments are then used, one for each period's inflow:

Bartik (l,t) = Sum over j of z(l,j)*m(j,t)
Bartik (l,t-1) = Sum over j of z(l,j)*m(j,t-1)

Then b will capture the short-run effect, and c captures the longer-term reaction to past supply shocks. However, for this to work, we need there to be independent variation in the two periods in where migrants are coming from. This can be a tall order, since the country-of-origin mix of the flow of immigrants to the U.S. is so similar over time – since the 1980s the correlation across metro areas between the instrument and its lag is 0.96 to 0.99. The result is that Bartik instruments are usually going to be strongly serially correlated and not provide enough variation to separately identify dynamics. The authors suggest such a strategy worked better in the 1970s vs 1980s in the U.S., due to policy changes and other shocks dramatically changing the immigrant composition then, whereas since the 1980s, the country-of-origin mix is too highly correlated over time. They also suggest it might work better in European countries where immigrant flows have been less stable over time.
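The serial-correlation problem is easy to see in a toy simulation. The sketch below assumes the same initial shares are used for both periods and compares two made-up scenarios for the national inflows: one where the origin mix barely changes, and one where the ranking of origin countries is reversed. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_loc, n_origin = 200, 10

# Hypothetical initial shares of each origin group across origin countries,
# one row per location (rows sum to one).
z = rng.dirichlet(np.ones(n_origin), size=n_loc)

# Made-up national inflow rates by origin country in period t-1.
m_prev = rng.uniform(0.01, 0.10, size=n_origin)

# Scenario A: origin mix in period t is almost the same as in t-1.
m_stable = m_prev * (1 + rng.normal(scale=0.05, size=n_origin))

# Scenario B: the ranking of origin countries is reversed in period t.
m_changed = m_prev.max() + m_prev.min() - m_prev

def corr_with_lag(m_now):
    """Correlation across locations between the instrument and its lag."""
    return np.corrcoef(z @ m_now, z @ m_prev)[0, 1]

corr_stable = corr_with_lag(m_stable)
corr_changed = corr_with_lag(m_changed)
print(f"stable mix: {corr_stable:.3f}, changed mix: {corr_changed:.3f}")
```

With a stable origin mix the two instruments are nearly collinear, so the regression cannot tell b from c; only a real change in composition breaks that link.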

Should friends let friends shift-share?
There is a reason that these instruments have become so popular – the questions they are trying to answer are ones with high policy interest, and people have struggled to find other convincing instruments. Nevertheless, these two new papers make much clearer what the identifying assumptions underlying the Bartik instrument are, and so those planning to use these instruments will have to work hard to convince readers that their key initial shares are exogenous, and that they have appropriately considered adjustment dynamics.

 

Comments

Submitted by Paul Christian on

Food aid and conflict

Thanks for the helpful summary, David. Last year Chris Barrett and I wrote a paper showing issues in identification of these models in the context of the question of whether food aid causes conflict. We showed that issues arise when the Bartik/shift-share/interacted instrument treats trending variables as though each observation is an independent draw. This causes problems for inference and raises the risk of omitted variable bias or spurious correlation. Interacting a potentially endogenous variable with a time series instrument doesn’t really improve the argument for identification. When the trends are non-linear, many of the usual attempts to diagnose or correct problems don’t work well. Our general take-away is that we all need to be much more careful in evaluating where variation comes from rather than taking it for granted that IV approaches are better identified than the OLS models.

Submitted by David Jaeger on

Paul, your paper with Chris Barrett is great, and I completely agree with you. Ted Joyce and Bob Kaestner and I make some similar arguments in an entirely different context in our re-examination of Kearney and Levine's paper on 16 and Pregnant:

http://www.djaeger.org/research/wp/JJK_Sixteen_and_Pregnant_JBES_Revision.pdf

I think people have relied for far too long on these kinds of instruments without thinking clearly about the assumptions necessary for identification.

Submitted by Tim Bartik on

I am concerned that some readers of your blog post may exaggerate the problems with the so-called “Bartik instrument” (not a name I coined!) as a way of identifying local labor demand shocks.

1. The Bartik IV uses national-growth-weighted industry shares as an instrument, not the industry shares themselves.

2. These weights are not arbitrary but are derived from a model of regional labor demand in which, other things equal, local export-base industries tend to maintain market share due to changes in national labor demand (see Appendix 4.2 of my 1991 book, now available for free online: http://research.upjohn.org/up_press/77/ ).

3. You will not get anything close to the same estimates from using the Bartik IV, versus directly using the industry shares as instruments. For example, see the recent paper by Van Dijk, in the Journal of Regional Science. https://onlinelibrary.wiley.com/doi/abs/10.1111/jors.12378 He compares Bartik IV estimates with industry-share-instrument estimates (his “Regress-M” estimates). The Bartik IV estimates look like demand shock estimates, the industry share estimates do not.

4. It may be a SUFFICIENT condition for identifying demand shocks for all the industry shares and national industry growth rates to be uncorrelated with any local variable that is potentially correlated with local labor supply shocks. But it is not a NECESSARY condition. In other words, even if one or more local industry share is correlated with some variable that is related to local labor supply shocks, such as local immigrant shares, the Bartik IV need not be so correlated. The Bartik IV does NOT require local industry shares to all be exogenous.

5. In many real data sets one might use, it is quite plausible that the Bartik IV might be uncorrelated with local labor supply shocks even if the local industry shares have some correlation with variables related to local supply shocks. For example, using annual data, national growth shocks tend to be volatile and might often be negatively correlated over time for the same industry, whereas supply shocks tend to be more gradual and more long-term.

6. For empirical practice, the most important issue is whether using the Bartik IV is less biased than using OLS. If we are regressing the change in local labor market outcomes on local job growth, we know that OLS will be biased by a wide variety of local labor supply shocks. These local labor supply shocks are not only due to immigration, but also due to changes in amenities, demographics, education, zoning rules, housing codes, and transportation infrastructure. The relevant issue is whether the Bartik IV is less biased by local labor supply shocks than is true for OLS.

7. Over-identification tests or other procedures that lead to weighting the industry shares based on purely econometric criteria are a bad idea, compared to weights for the industry shares that are based on some model of what drives regional labor demand, which is the case for the Bartik IV. Non-theory-based weights are a bad idea in part because industry shares may not all be completely uncorrelated with supply shocks. Theory-based weights that help identify demand shocks are a feature not a bug.

8. The first stage can be examined to see whether the Bartik IV estimates can plausibly be interpreted as identifying the multiplier effects of shocks to export-base industries, based on the size and variation over time in the estimated effects. For example, in my 1991 book, I spent some time examining the first stage to see if the estimates seem to be uncovering reasonable multiplier effects (Appendix 4.2 again).
