Working in Nairobi on Kenya and South Sudan, my colleagues and I spend much of our time worrying about implementation. There are lots of ideas for tackling poverty, many supported by rigorous small-scale evaluations (and many that are not). The hard part is making those ideas work in practice and on a large scale through governments.
The challenge of implementation is illustrated by an intriguing paper I saw presented a couple of weeks ago at the Annual Conference of the Centre for the Study of African Economies (CSAE) at Oxford University. The paper, by Tessa Bold and co-authors, examines a randomized controlled trial of contract teachers in Kenya.
In previous RCTs in India and Western Kenya, hiring contract teachers has been shown to improve test scores among primary school students. The main innovation in Bold et al. is that they try two different forms of implementation of the same intervention. A total of 192 schools from across Kenya were split at random into three groups: 64 were assigned to the control group, 64 to receive a contract teacher through the Ministry of Education, and 64 to receive a contract teacher under the coordination of a local affiliate of an international NGO, World Vision Kenya. Recruitment procedures, salary, and experimental protocols were held constant across the two treatment arms.
The key result is seen in the figure below (taken from the CSAE presentation by Justin Sandefur, one of the co-authors). It shows the intention-to-treat (ITT) estimate of the impact on a school’s average test scores of being assigned to receive a contract teacher. The leftmost bar (“Overall”) shows that among all schools combined, the treatment raised test scores by 0.08 standard deviations. The remaining two bars show the effects estimated separately for the two treatment groups. The effect for schools where the contract teacher was administered by the NGO was 0.18 standard deviations, while for schools where it was implemented by the Ministry of Education (MOE), the effect was negative and statistically indistinguishable from zero. This means that the entire effect for the overall sample is driven by the impact at NGO-treated schools. The effect on test scores where the intervention was run by the NGO is similar to what RCTs elsewhere have shown.
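As a rough check on how the three bars relate, here is a back-of-the-envelope sketch. It assumes the pooled estimate behaves like a simple difference in means against a shared control group, so that with 64 schools in each treatment arm the pooled effect is the equal-weighted average of the two arm effects. The function name `implied_arm_effect` is hypothetical, and the implied MOE number is an illustration derived from the rounded figures quoted above, not a result reported in the paper.

```python
def implied_arm_effect(pooled, known_arm, n_known, n_other):
    """Back out one arm's effect from the pooled effect and the other arm's effect.

    Assumption: the pooled effect is the size-weighted average of the two
    arm-specific effects, which holds for a simple difference in means
    against a shared control group.
    """
    total = n_known + n_other
    return (pooled * total - known_arm * n_known) / n_other


# Rounded figures quoted in the post: pooled 0.08 SD, NGO arm 0.18 SD,
# and 64 schools in each treatment arm.
moe_effect = implied_arm_effect(pooled=0.08, known_arm=0.18, n_known=64, n_other=64)
print(f"Implied MOE effect: {moe_effect:.2f} SD")
```

Under these assumptions the implied MOE effect is a small negative number, which fits the description of an estimate that is negative but statistically indistinguishable from zero.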
So why did the intervention succeed in the NGO-treated schools and achieve nothing in the MOE-treated schools? At NGO-treated schools, contract teacher vacancies were filled more quickly and there were shorter delays in the payment of the teachers’ salaries. As a result, the average period of actual exposure to the contract teacher was shorter in the MOE-treated schools than in the NGO-treated schools. This shorter exposure period does not, however, explain the difference in the ITT estimates. We know this because the same difference in treatment effects (positive for NGO, about zero for MOE) is seen in the estimates of the average treatment effect on the treated (ATT), which take into account the difference in exposure periods.
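The logic of that last step can be sketched with a simple rescaling. One common way to move from an ITT estimate toward an effect on the treated is to divide by the share of the intended treatment actually received. This is only an illustration of the reasoning: the authors' actual ATT estimation may differ, the exposure shares below are made-up placeholders, and the MOE ITT value is a stand-in for "small and negative" since the post gives only its sign. The function name `rescale_itt` is hypothetical.

```python
def rescale_itt(itt, exposure_share):
    """Scale an intention-to-treat estimate by the share of treatment received."""
    if not 0 < exposure_share <= 1:
        raise ValueError("exposure_share must be in (0, 1]")
    return itt / exposure_share


# Hypothetical inputs, chosen only for illustration. The NGO ITT of 0.18 SD
# is quoted in the post; every other number here is an assumption.
arms = {
    "NGO": {"itt": 0.18, "exposure_share": 0.9},
    "MOE": {"itt": -0.02, "exposure_share": 0.6},
}

rescaled = {
    name: rescale_itt(arm["itt"], arm["exposure_share"])
    for name, arm in arms.items()
}

for name, value in rescaled.items():
    print(f"{name}: rescaled effect {value:+.2f} SD")
```

Dividing by a positive exposure share cannot change the sign of an estimate, and it leaves an estimate near zero still near zero. That is why shorter exposure alone cannot account for the gap between the two arms.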
The differences in implementation along observed lines suggest that the two arms also differed along unobserved lines. The NGO may have been more careful in monitoring the teachers to ensure that they were actually present in school. Or the delays in payments at MOE-treated schools may have made teachers disgruntled and less likely to exert effort. Another possibility is that the quality of teachers differed between the two treatment arms. The authors explore these and a few other hypotheses without coming to definite conclusions as to the mechanism. Although they can’t nail it down, poor implementation in some form most likely explains the disappointing result at the MOE-treated schools.
I expect this paper will be a sort of Rorschach test for views on RCTs and service delivery in developing countries. Evaluation skeptics may try to cite this as evidence that RCTs are a waste of time, since it suggests that successful interventions implemented by NGOs, as they often are in experiments, may not be replicated at scale by governments. Others might take the paper to indicate that NGOs should be the preferred vehicle for interventions. I think these readings would be mistaken, and I take two reflections from the paper. First, we should do many more rigorous studies working with governments where we vary forms of service delivery to better understand what can work in practice. Second, the World Bank’s approach to public services—the long, difficult slog of working to improve government systems—is the right one, because it’s the only way to ultimately make services work for the poor at large scale.


