Published on Development Impact

What’s in a title? Signaling external validity through paper titles in development economics

David Evans

February 17, 2016


External validity is a recurring concern in impact evaluation: How applicable is what I learn in Benin or in Pakistan to some other country? There are a host of important technical issues around external validity, but at some level, policy makers and technocrats in Country A examine the evidence from Country B and think about how likely it is to apply in Country A. But how likely are they to consider the evidence from Country B in the first place?

Development economists sometimes try to signal the external validity of their work by how they frame the evidence they present. For example, “Strengthening State Capabilities: The Role of Financial Incentives in the Call to Public Service” (anywhere!) sounds more generally applicable than “The Political Economy of Deforestation in the Tropics” (tropical countries only!) which in turn sounds more general than “Education and Human Capital Externalities: Evidence from Colonial Benin” (Benin … 150 years ago!).

To characterize the norms in this area, I drew a sample of 450+ papers published between 2010 and 2015 in 6 journals that publish applied development economics research, to see how common it is for authors to frame their evidence as general versus country-specific. Specifically, I examined empirical development papers that use evidence from one or two countries, with at least one of them being a low- or middle-income country (as of 2010). To get a range of publications, I looked at three general interest journals – the Quarterly Journal of Economics (ranked #1 among economics journals by simple impact factor), the American Economic Review (#10), and American Economic Journal – Applied Economics (#31) – and three development field journals – the Journal of Development Economics (#36), Economic Development and Cultural Change (#132), and World Development (#136). For the general interest journals, I used the universe of applied development articles (mid-2010 to mid-2015); for the field journals, I drew a sample from the same period.
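For readers who want to run a similar tally on their own list of papers, the title check described above could be sketched roughly as follows. This is a minimal illustration, not the procedure actually used for this post: the function names, the short country list, and the whole-word matching rule are all assumptions, and the three example titles are simply the ones quoted earlier rather than the sample itself.

```python
import re

def title_names_country(title, country_names):
    """Return True if any country name appears as a whole word in the title.

    Assumption: a plain whole-word match. Adjectival forms ("Indian") and
    sub-national places would need extra handling.
    """
    return any(
        re.search(r"\b" + re.escape(name) + r"\b", title, flags=re.IGNORECASE)
        for name in country_names
    )

def share_naming_country(titles, country_names):
    """Return the fraction of titles that name at least one country."""
    if not titles:
        return 0.0
    named = sum(title_names_country(t, country_names) for t in titles)
    return named / len(titles)

# Hypothetical, deliberately short country list (countries mentioned in this post).
COUNTRIES = ["Benin", "Pakistan", "Indonesia", "Tanzania",
             "Kenya", "Bolivia", "China", "India"]

# The three titles quoted earlier in the post, used only as illustrations.
EXAMPLE_TITLES = [
    "Strengthening State Capabilities: The Role of Financial Incentives "
    "in the Call to Public Service",
    "The Political Economy of Deforestation in the Tropics",
    "Education and Human Capital Externalities: Evidence from Colonial Benin",
]

if __name__ == "__main__":
    for title in EXAMPLE_TITLES:
        print(title_names_country(title, COUNTRIES), "-", title)
    print("Share naming a country:", share_naming_country(EXAMPLE_TITLES, COUNTRIES))
```

Applying the same function to titles grouped by journal would give the kind of comparison discussed in the facts that follow, provided the country list is complete.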

Fact 1: The broad norm is to include the country of evidence in the title. More than two-thirds of articles (69%) do this. I believe in a degree of external validity (i.e., we can learn across contexts), but I also put weight on local evidence. Signaling the source of the evidence in the title of the paper is one way to make it easier for people to find local evidence.

But it may not be the best way to reach the broadest academic readership, per Fact 2.

Fact 2: Papers are much more likely to identify the country in field journals than in general interest journals. In fact, the higher ranked the journal, the less likely it is that the country is mentioned in the title.

Of course, this association does not indicate that including the country in the title has a causal impact on journal placement. Rather, it may well be that articles of more general interest are BOTH less likely to mention the country in the title AND more likely to get published in the top ranked journals.

Fact 3: There is no simple relationship between a country’s income level and whether the country appears in the title. Papers on lower middle income countries are slightly more likely to name the country in the title than papers on low or upper middle income countries, but as we’ll see next, this may well just be the China-India effect, as both fell into the middle group.

Fact 4: If the evidence is from the most populous countries (China and India), then authors do identify the country.
This is consistent with work by Das et al. showing particularly high research production and – potentially – interest in these countries: “The first-tier journals together published 39 papers on India, 65 papers on China, and 34 papers on all of Sub-Saharan Africa.”

Fact 5: Only a few papers identify the source of the data in neither the title nor the abstract. The vast majority of papers, if they don’t have the country in the title, identify the source of the data in the abstract.

These authors are implicitly making a strong argument that the source of the data is irrelevant. For example, when authors present a model of technological learning and test it with a field experiment, but don’t reference the country, it is implicit that the results aren’t specific to Indonesia. Likewise, when a paper examines the relative roles of motivation, training, and knowledge in health care provision but omits the country of study from the title and abstract, this suggests that it doesn’t matter that this took place in Tanzania.

Of course, in both cases it probably does matter. Seaweed farmers in Indonesia may learn differently than sorghum farmers in Kenya, and health workers in Bolivia may have different weights on motivation versus knowledge. Not including the country of study even in the abstract seems to unnecessarily tax those who believe that context matters.

Conclusion: Obviously, the title is just one way that authors signal the general interest of their evidence. They also do so through argument and data in the abstract and throughout the paper. Although articles in top journals are less likely to have the country name in the title, note that even in those journals, more than half of applied development articles do so.
Authors can reference the source of the evidence and still publish well.

Bonus: Do economists think Africa is a country? For the most part, no. Out of 127 articles in the sample with applied work in a country of Sub-Saharan Africa, only 3 use evidence from a single country to stand in for Africa as a whole.


Authors

David Evans

Senior Fellow, Center for Global Development


Join the Conversation


Jörg Peters

February 24, 2016

Test of a theory vs. external validity

Nice piece! We have done a systematic review of RCTs published in top econ journals to check how they deal with external validity issues – with somewhat complementary findings: most papers do not really discuss external validity potentials and limitations. Here is the discussion paper: http://en.rwi-essen.de/publikationen/ruhr-economic-papers/731/ We have been in touch with many of the authors, and what some of them told us was that our assumption is slightly wrong. We assume (and so do you in this blog, I guess) that RCTs are designed in country X to inform policy - not only in country X, but also in country Y. This is not necessarily true, so the argument goes, because many RCTs published in econ journals are not done to inform policy but to test a theory (“proof-of-a-concept”). I see their point, but I am wondering if this is really an entirely valid argument. My impression is that most published RCTs in fact do want to inform policy - in country X, but also in country Y. And if a paper really only aims at a test of a theory, this should also be clearly said in the paper. Otherwise, the pitiful ambitious evidence-based policy maker takes the results at face value and applies them in country Y… I dare to say, however, that in many cases a mere “test of a theory” claim would not be enough to get into a general interest journal. This matches your observation that papers in field journals are more inclined to identify the country than papers in general interest journals. In a very bold way of course.

M Chrisney

March 02, 2016

4 questions for policymakers to consider

Internal validity is often gained at the expense of wider policy relevance -- or external validity. A paper by Peters et al. reviews some of these limitations, and here I frame them as questions that should be foremost in the minds of practitioners and policymakers when assessing an RCT-based policy prescription.

1. Is the setting for the experiment similar to the setting for implementing the policy? This is critically important for efforts to extend policies within a country where regional, linguistic, and ethnic differences are present and is certainly relevant when transplanting ideas across countries. This is a problem of transferability.

2. Could scaling up a pilot program have unintended negative effects? Unlike immunization programs where scaling up often has benefits, such as reducing the channels for transmission of communicable disease, a market-based experiment can have negative effects at scale. For example, while fertilizer subsidies may work in a small-scale, controlled experiment, when used at scale they may lead to excess demand and higher prices that offset the value of the subsidy. This is the problem of scaling up.

3. Can we replicate the conditions needed for success? While the experimental case has the benefit of personalized attention from the implementing team and doting academics (what might be called: First Class Treatment), the scaled-up version may be carried out less attentively by a large bureaucratic organization (Tourist Class Treatment). This difference can undermine the apparent cause-effect linkage of the policy/program.

4. How widespread are the benefits? It should be noted that RCTs for the most part capture the average effects of a policy, so there may be a wide variation in actual outcomes for individuals. A few may benefit greatly, while many do not. This can affect the political economy of implementing a policy based on RCT evidence.

Furthermore, we know that everyone acts differently when the camera is on, and the same can happen with participants in social science trials. Behavioral changes can occur that bias the results among those “treated” (Hawthorne effect) and/or among the “non-treated” or control group (John Henry effect). Whether RCTs are a reliable source of cross-country policy advice is still an open question. In part, it will depend on how well these studies are framed to answer the issue of “scalability” and the attention given to justifying any claims of external validity. As always, Caveat Emptor (Policymaker). @mchrisney
