Published on Development Impact

Some thoughts on the Give Directly Impact Evaluation

On Friday, the first evaluation of Give Directly was released, and covered by The Economist and NPR, among others. As discussed in a previous blogpost, Give Directly makes unconditional cash transfers to households in Kenya, targeted on the basis of whether or not they live in a thatch-roofed house. The first findings look at the short-term impact of these grants, either while households are still receiving them or within a few months of having received them.

The basic findings are encouraging from the point of view of charitable donations – if you want to feel assured that money given to Give Directly is immediately having a positive effect on the lives of poor people, this evaluation has plenty for you: households upgrade their thatch roofs to metal, build up livestock, consume more food, increase small business income, and are happier and less stressed. And for those whose views about how the poor spend money are shaped by stereotypes of the homeless in the U.S., there isn’t any significant increase in spending on alcohol, tobacco or gambling. So there is much more evidence of positive effects than for the vast majority of charities out there.

Of course, for many people the fact that giving people a bunch of money today (US$300 for most recipients, which is about 2 months of household non-durable consumption) leads to households being better off today is not really surprising, and so their questions turn to three things: a) are these effects long-lasting? b) does the program have negative effects on those neighbors who don’t receive the grants? and c) when all this money goes into small villages, doesn’t it just push up prices? What does the evaluation have to say about these? On a), obviously we need to wait for the long-term follow-up studies to really know, but there is a bit of evidence either way here – households are building some business assets, but there aren’t any significant effects in the health and education investment domains (the paper also has a little on short-term dynamics). On b) and c), the study is designed to measure these spillovers, randomizing first at the village level to create some pure control villages, and then within villages as to who is treated. They don’t find any evidence of negative spillovers on neighbors, and find no significant changes in village prices, wages or crime.
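To make that spillover design concrete, here is a minimal sketch in Python of what such two-stage randomization looks like – the village and household counts and the assignment shares are placeholders for illustration, not the study’s actual numbers:

```python
# A minimal sketch of the two-stage randomization described above (all counts
# and shares here are placeholders, not the study's actual sample sizes).
import random

random.seed(0)

villages = [f"village_{i}" for i in range(20)]
households = {v: [f"{v}_hh_{j}" for j in range(30)] for v in villages}

# Stage 1: randomize at the village level, keeping some pure control villages
# where no household is treated.
treated_villages = set(random.sample(villages, k=len(villages) // 2))

assignment = {}
for v in villages:
    if v not in treated_villages:
        for hh in households[v]:
            assignment[hh] = "pure_control"
    else:
        # Stage 2: within treated villages, randomize which households receive
        # the transfer; the rest are untreated neighbors (the "spillover" group).
        recipients = set(random.sample(households[v], k=len(households[v]) // 2))
        for hh in households[v]:
            assignment[hh] = "recipient" if hh in recipients else "spillover_control"

# Spillovers: compare spillover_control vs. pure_control households.
# Direct effect: compare recipient vs. spillover_control (or vs. pure_control).
```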

Things of interest (and possible concern) to impact evaluators
The evaluation is carried out by Johannes Haushofer and Jeremy Shapiro, and does a lot of things really well:
 
  • They are careful to note that Jeremy is a co-founder and former director of Give Directly – such disclosure is important when people who have a vested interest in the results are involved in the evaluation.
  • The evaluation design and pre-analysis plan are pre-registered at the AEA registry. This includes procedures to deal with testing impacts on multiple outcomes.
  • Rather than just asking whether the program is working or not, they use the evaluation to test different ways of doing it, so that the evaluation can help guide improvements. To this end they i) randomize whether the money is given to the husband vs. the wife; ii) randomize whether individuals get the money as a single lump-sum payment vs. as a stream of monthly payments for 9 months; and iii) test sensitivity to the level of the transfer by giving one group approximately $300 (small transfer) and another $1,100 (large transfer). This leads to lots of different treatment groups, shown in the figure below (a small sketch enumerating the combinations also follows this list).

[Figure: breakdown of the treatment groups]
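For illustration, here is a hypothetical enumeration of those cross-randomized arms – the labels are mine, and the study’s actual assignment was more involved (the large transfers, for instance, were announced later), so treat this as a sketch rather than the study’s design:

```python
# Hypothetical sketch: three binary design choices imply 2 x 2 x 2 = 8 arms.
from itertools import product

recipient = ["husband", "wife"]
timing = ["lump_sum", "monthly_over_9_months"]
size = ["small_~$300", "large_~$1100"]

arms = list(product(recipient, timing, size))
for i, arm in enumerate(arms, start=1):
    print(i, arm)

print(f"{len(arms)} possible treatment combinations")
```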
     
So what might we be concerned about?

  • Self-reporting: This is an issue for every study I know of which has consumption as one of its main outcomes. A survey team separate from the intervention organization did the interviewing, but it is likely the two were linked in many respondents’ minds. The concern is of course that, after money has just dropped out of the sky onto some households, the treated households may feel they should over-report well-being and consumption and under-report uses they think might not be approved of, while the control group might have incentives to under-report consumption in the hopes of receiving transfers in the future. I feel this concern is larger when surveys take place close in time to the grants (as here), and when the grant is given as a charitable transfer/gift rather than as a prize or other windfall. One possible solution is to rely more on objective measures – here they take saliva samples to measure cortisol, a stress indicator, as well as anthropometric measures, but don’t see much impact on these despite reported food intake increasing. In future follow-ups perhaps they can physically verify certain asset holdings (certainly the roof, maybe also cattle) to make these less susceptible to reporting bias.
  • The other (complementary) approach is to give reasons to think this may not be so important; Johannes notes a couple of these reasons in an email to me: i) you might think people would over-report health and education spending, since they would think donors like this, but the lack of any impact here suggests this isn’t dramatic (of course it is also consistent with the program having a negative impact on health and education spending and over-reporting taking this to a nil effect, but this seems less likely); ii) the large transfers were a surprise to households and were announced later than everything else, but villages which got more of these surprise large transfers don’t show differential reporting.
     
  • Power: The downside of trying to test so many different variants of the program is that the power to distinguish between them can be somewhat low. As seen above, they only have 500 treatment households, but effectively have 3 binary treatment variations (so 8 combinations). A first concern is whether there are interactions between the treatments, something that is not tested. Second, even assuming linearity, the power is limited, especially for outcomes like expenditure, which can be noisy. For example, the main treatment effect on non-durable expenditure is US$36, with a standard error of $6, while the female recipient effect is -$2, with a standard error of $10. So they can’t rule out that a female recipient has less than half the effect of a male recipient, nor that a female recipient has a 50% higher treatment effect than a male recipient (a back-of-the-envelope version of this calculation is sketched after this list). Similarly wide confidence intervals are found for business revenues and non-land assets – as a result, the study is not as informative as one would like about design choices, because it is perhaps trying to do too much.
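As a rough illustration of that last point, here is the arithmetic in Python, under the simplifying assumptions that the female-recipient coefficient is an additive shift on the main effect and that the covariance between the two estimates can be ignored:

```python
# Back-of-the-envelope check using the coefficients quoted above (US$, monthly
# non-durable expenditure). Simplification: treat the female-recipient coefficient
# as an additive shift on the main effect and ignore estimation covariance.
main_effect, main_se = 36.0, 6.0       # main effect and its standard error
female_shift, female_se = -2.0, 10.0   # female-recipient coefficient and its SE
# (main_se is shown for context; this sketch holds the main effect fixed.)

z = 1.96                               # approximate 95% confidence interval
lo = female_shift - z * female_se      # about -21.6
hi = female_shift + z * female_se      # about +17.6

print(f"implied female effect range: {main_effect + lo:.1f} to {main_effect + hi:.1f}")
print(f"as a share of the male effect: {(main_effect + lo) / main_effect:.2f} "
      f"to {(main_effect + hi) / main_effect:.2f}")
# Roughly 0.40 to 1.49 – consistent with "less than half" up to "50% higher".
```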
The bottom line, though, is that this is a very well-designed study, and we would love, love, love to see more charitable programs (and government programs!) evaluated to the same standard of rigor. I look forward to seeing the longer-term follow-up results.
 
  • Update: Since writing this on Friday, I see Chris Blattman makes some similar points about self-reporting.
     

Authors

David McKenzie

Lead Economist, Development Research Group, World Bank
