A Framework for Taking Evidence from One Location to Another



“Just because it worked in Brazil doesn’t mean it will work in Burundi.” That’s true. And hopefully obvious. But some version of this critique continues to be leveled at researchers who carry out impact evaluations around the world. Institutions vary. Levels of education vary. Cultures vary. So no, an effective program to empower girls in Uganda might not be effective in Tanzania.

Of course, policymakers get this. As Markus Goldstein put it, “Policy makers are generally not morons. They are acutely aware of the contexts in which they operate and they generally don’t copy a program verbatim. Instead, they usually take lessons about what worked and how it worked and adapt them to their situation.”

In the latest Stanford Social Innovation Review, Mary Ann Bates and Rachel Glennerster from J-PAL propose a four-step strategy to help policy makers through that process of appropriate adaptation of results from one context to another.

Step 1: What is the disaggregated theory behind the program?

Step 2: Do the local conditions hold for that theory to apply?

Step 3: How strong is the evidence for the required general behavioral change?

Step 4: What is the evidence that the implementation process can be carried out well?

Let’s walk through one example that they provided and then one of my own. In India, an intervention that provided raw lentils each time parents brought their children for immunization and then a set of metal plates for completing all needed vaccinations (at least 5 visits) had a dramatic impact: Completed immunizations jumped from 6 percent in comparison communities to 39 percent in incentive communities. As J-PAL works with several governments to boost immunization rates, they follow the four steps:
  1. Theory behind original program: Parents do not oppose vaccination, but they are sensitive to small price changes.
  2. Local conditions in the new context: How do parents feel about vaccination in Sierra Leone or in Pakistan? Here, they propose a suggestive piece of evidence: If parents take their children for at least one immunization but then fail to get the rest, it suggests they aren’t opposed to vaccination per se but rather may be sensitive to transport or other costs. This seems to be the case in both new settings.
  3. Evidence on changing behavior: There is lots of evidence that people underinvest in preventive care and that they are sensitive to even small changes in price.
  4. Possibility of implementation: Could Sierra Leone or Pakistan pull off this kind of incentive program? Bates and Glennerster’s take: In Pakistan, probably yes, because of the deep penetration of mobile money. But in Sierra Leone, it remains to be tested whether incentives could be effectively delivered to parents via the clinics.
Okay, so having seen their examples, let me work through one of my own. Consider a recent education evaluation of technology-aided instruction in urban India (paper; blog post). The program had large impacts on math and Hindi test scores. How well would this likely translate to rural Tanzania?
  1. Theory behind the India study: There are students of wildly different ability in a given class, and most students are behind the curriculum. It’s difficult or impossible for teachers to reach all student levels in one class, and students learn more effectively when they receive instruction at their level. (In the case of the India study, the technology facilitates that.)
  2. Local conditions in the new context: In Tanzania – especially rural Tanzania – most students are far behind the curriculum. As the latest Uwezo report for Tanzania tells us: “Only one in four children in Standard 3 can read a Standard 2 story in Kiswahili. … Four out of ten children in Standard 3 are able to do multiplication at Standard 2 level.” The same study shows large variation by grade:
[Figure: Children’s reading ability in Kiswahili by grade. Source: Uwezo 2013]
  3. Evidence for behavior change: A growing collection of studies point to the returns to interventions that help learners receive instruction at their level. Evidence from splitting classes by ability in Kenya, re-organizing classes by ability for an hour a day in India, providing teaching assistants to help the lowest performers in Ghana, and others. So there is broad evidence that students respond to this type of intervention.
  4. Possibility of implementation: I’m skeptical of the possibility of effectively scaling dedicated computer centers in rural Tanzania right now. The original program in India was in urban Delhi, and a previous, successful technology-aided instruction program in India was also in a city. I’m happy to be proven wrong, but I suspect that Step 4 is where adaptation would stop making sense in this case.
As Bates & Glennerster point out for one of their examples, stopping at Step 4 doesn’t mean “the original study does not generalize. All we will have found is that the local implementation failed, not that the underlying behavioral response to incentives was different.” In this case, the process of working through Step 2 (yes, Tanzania faces the same problem) and Step 3 (yes, there is lots of evidence on how to change this) would point us to starting over with one of the other “teaching at the right level” interventions that might be effectively implemented in rural Tanzania.

Evidence from one context can clearly be useful in other contexts. But translation isn’t automatic. This four-step framework can be useful in figuring out how likely that translation is to work.
Bonus reading: Every regular Development Impact contributor has written on this issue; you can find their posts on the Development Impact blog.


David Evans

Senior Fellow, Center for Global Development

Join the Conversation

Varja Lipovsek
May 30, 2017

Thanks David for this note -- you are right, many people have written on this subject, yet it is good that we keep on writing about it. Because while it's definitely true that "most policy makers are not morons" it is also true that very many really don't know how to use or interpret evidence and therefore, yes, many times an idea which was proven to work in context X is more or less just plonked down in context Y. (Take Kenya's "one laptop per child" folly.) Moreover, it is also true that very many (though by no means all) researchers are happy to keep copy-and-pasting initiatives from one context to another, simply because initiative X has not yet been tried in context Y, and the ensuing results will therefore fill a "knowledge gap."
What is rare to come by is an open (and trusting, and sustained) conversation between the researchers and policy makers and, moreover, policy implementers (not one and the same) on the kind of rationalizing and questioning that you point out. Sometimes intermediary organizations like Twaweza (where I am writing from) can help to play that translator / mediator role.
And speaking of Tanzania (hello from Dar), you might be interested to learn that the government education sector continues to spend copious amounts of funding on initiatives that have been proven to be ineffective just about everywhere they have been studied (as far as learning outcomes go) -- e.g., building laboratories, stocking schools full of plastic desks, printing new textbooks out of sync with learner pace and with mistakes. We have some ways to go, indeed.
And last but not least, Uwezo (part of Twaweza) has released a new report on the state of learning (you cite the 2013 report); the report covering every single district in Tanzania in 2015 can be found here http://twaweza.or.tz/go/uwezo-tanzania-2015-ala
Data are freely accessible as well - in case you or others want to have a go at it.
Varja Lipovsek
(Director Learning, Monitoring & Evaluation at Twaweza East Africa)

David Evans
June 02, 2017

Thanks so much for your comment, Varja. Thanks for highlighting the newer Uwezo report. (It's unfortunately not listed on the Uwezo "reports" page, which is why I didn't see it before.) 
You bring up several important points. Notably, the issue that governments continue to invest in areas that have not been effective elsewhere. Maybe the framework should have a Step 0: Did the program actually work somewhere else? That's not to say governments shouldn't innovate; they absolutely should! But innovating is not the same as re-trying something that was ineffective in other locales. 

Matthew Jukes
June 02, 2017

Thanks for the blog Dave (and hello Varja).
What you (and Mary Ann and Rachel) say about the role of theory in understanding external validity is spot on. The problem is that so many experimental evaluations focus only on generating a point estimate and not on developing an understanding of *how* the program works. Blattman makes this point more extensively: http://chrisblattman.com/2016/07/19/14411/
I thought about this a lot when we evaluated the HALI literacy project in Kenya (http://bit.ly/2qHMQcY). We spent a lot of time diagnosing the contextual instructional issues we were trying to address (and wrote a separate paper on this - http://bit.ly/2swc0vZ) and spent a lot of time assessing whether these instructional issues were indeed addressed by the intervention (http://bit.ly/2eytlzN). This focus on context and mechanisms is critical for making claims about external validity – our conclusion isn’t ‘our project worked – try it in your context too!’ but rather something like: ‘if your context is the same as ours (teachers don’t focus on text and don’t break words down into sounds and syllables) then the key to improved learning is to get teachers to change these behaviours. Here’s how we did it.’ I think this is more usable information and generally furthers our understanding of how to improve literacy instruction.
I’d love to see more evaluations focus seriously on context and mechanisms in this way. But perhaps my repeated use of the phrase ‘we spent a lot of time’ explains why they don’t.
Matthew Jukes
Sr Education Evaluation Specialist

David Evans
June 02, 2017

Thanks, Matthew. I think that work you highlight is a great example of looking at all of these elements, and I hope that more evaluations pull it off. But time and funding both enter in, as well as the ability to assemble and manage an interdisciplinary team effectively. More examples will perhaps help guide future researchers.