Last week, my unit at the Bank organized a workshop on Cost Analysis for Interventions in Human Development. No – this wasn’t a ploy to gather a bunch of accountants in one place to see how many it would really take to change a light bulb. It was meant to start a conversation on how to combine costs with the many, now very credible, impact estimates coming out of the evaluation world these days – to see whether what works is really worth investing in, and to graduate from just getting hyper about identification, statistical significance, and effect sizes to thinking about what these estimates could (or should) mean for resource allocation. While making progress on this should be easier than, say, splitting the atom, it turns out that we currently know very little about accurately calculating the costs of our interventions in education, health, and social protection, nor have we reached much consensus on how to combine these cost measures, once we have them, with our estimated impacts.
On measuring costs, we just don’t have as much practice with this as we do with collecting data on impacts. Right now, most of us retrospectively ask about costs incurred over the life of the project. If project records were reliable and inputs were used solely for the project, this shouldn’t pose a problem. Typically, though, our expenditure records are spotty and many inputs involve people’s time. Getting at time investments is tricky: not only should we worry about recall periods here, but we also need to think about pricing the time of volunteer labor and the time of government or NGO workers who also have duties outside of our project. The same applies to infrastructure or equipment that has other uses.
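To make the time-pricing problem concrete, here is a minimal sketch – my own illustration with hypothetical numbers, not a method from the workshop – of one common approach: value a shared worker’s hours at an implied hourly wage and charge the project only its share of those hours.

```python
# Sketch of attributing shared staff time to one project.
# All numbers below are hypothetical, for illustration only.

def time_cost(hours_on_project, total_hours, annual_salary):
    """Cost of a worker's time attributable to one project,
    valuing hours at the worker's implied hourly wage."""
    hourly_wage = annual_salary / total_hours
    return hours_on_project * hourly_wage

# Hypothetical: a government health worker earning $6,000/year over
# 2,000 working hours spends 400 of those hours on the evaluated program.
print(time_cost(400, 2000, 6000))  # 1200.0
```

The same arithmetic extends to volunteer labor (substituting a shadow wage for the salary) and to shared equipment (substituting rental value for salary), though choosing those shadow prices is exactly where the guesswork comes in.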
To combine whatever costs we end up measuring with impacts, we do have some tools, and some do find their way into the closing paragraphs of our papers (for a very clear introduction to these methods, please see Patrick McEwan’s recent review paper). Cost-benefit analysis attempts to translate impacts into a monetary value that can be directly compared with the costs associated with a particular intervention - for example, converting increases in years of schooling or achievement into increases in lifetime earnings, as done in Mexico (Schultz, 2004) and Colombia (Angrist et al., 2002). Cost-effectiveness analysis, on the other hand, calculates the total amount of impact we could get for a given amount of spending - e.g. 6 years of education for every $100 spent - and typically is useful only for comparing multiple interventions (see J-PAL’s paper on this for charts that compare a number of interventions on a single dimension). Cost-utility analysis, a bit of a hybrid of the other two methods, attempts to translate impacts, such as averted diarrheal episodes or reductions in the risk of a heart attack, into a common but non-monetary scale and to calculate how much we can buy for a given amount of spending – e.g. 3 disability-adjusted life years for every $100 of spending (see Kremer et al., 2011 for an example from a water treatment intervention in Kenya).
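The three ratios above share the same simple arithmetic, which a toy sketch makes explicit. The totals and the $400-per-school-year monetization below are made-up numbers for illustration, not figures from any of the cited studies; only the "per $100" benchmarks echo the examples in the text.

```python
# Toy versions of the three methods described above.
# All inputs are hypothetical, chosen to reproduce the
# illustrative "per $100" figures quoted in the text.

def impact_per_dollars(total_impact, total_cost, per_dollars=100.0):
    """Units of impact bought per `per_dollars` of spending."""
    return total_impact / total_cost * per_dollars

# Cost-effectiveness: a program adding 120 years of schooling across
# all beneficiaries at a total cost of $2,000 -> 6 years per $100.
years_per_100 = impact_per_dollars(total_impact=120, total_cost=2000)
print(years_per_100)  # 6.0

# Cost-benefit: monetize the impact (e.g. via an estimated lifetime
# earnings gain per extra school year) and net out the cost.
earnings_per_school_year = 400.0  # hypothetical monetization
net_benefit = 120 * earnings_per_school_year - 2000
print(net_benefit)  # 46000.0

# Cost-utility: the same ratio on a non-monetary scale such as DALYs.
# 60 DALYs averted for $2,000 -> 3 DALYs per $100.
dalys_per_100 = impact_per_dollars(total_impact=60, total_cost=2000)
print(dalys_per_100)  # 3.0
```

The mechanics are trivial; as the rest of the post argues, all the difficulty lives in the inputs – what counts as total cost, and how impacts get monetized or standardized.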
So why aren’t we using these tools more, either for analysis or for policy dialog? I see at least three issues here. First, our inner purist who has spent so much time thinking about identification probably recoils at the thought of contaminating our clean impact estimates with the messier assumptions required for costing analysis. The cost-benefit analyses of improvements in education, for example, often rely on Mincerian wage regressions to monetize the value of schooling. Cost-effectiveness analysis also requires us to standardize impacts and costs across different interventions, which often involves assumptions for translating impacts into a common measure and for adjusting exchange rates for differences in purchasing power. When inputs are jointly used by various programs, estimating the fraction of their costs that can be assigned to your intervention is often guesswork.
Second, the issue of external validity becomes even more salient in the context of costs. Not only do costs vary wildly across settings (with a fixed transport budget, think about how many meters you can drive in the Africa region for every kilometer in India), but we should also expect both impacts and costs to vary between one-off projects and scaled-up programs within the same country. While scaled-up interventions may be able to reap economies of scale, implementation (and thus impact) might also suffer in a less controlled environment.
Finally, there are other issues that we still haven’t resolved on the impact side that shape the usefulness of our value-for-money estimates. Take the distribution of impact, for example, and the well-cited cost-effectiveness figure for deworming medication from Miguel and Kremer’s 2004 study in Kenya, where $3.50 buys you an additional year of schooling. Since the medication cost only $0.50 per child to administer, and since we don’t expect more spending on medicine to amplify impact for a single child, this figure could be a bit misleading for a policymaker who doesn’t know about the average effect of the program. In particular, what this figure doesn’t tell us is that the additional year of schooling that we can buy for $3.50 is really the sum of schooling increases across at least 7 children. Of course, we can always include this information when comparing programs, but then we’ve just attached another attribute for a policymaker to think about. Things get even more complicated when trying to determine the value-for-money we get when an intervention generates different impacts on different outcomes of interest (for example, Karlan and Zinman find that expanded access to credit both increased incomes and worsened mental health in South Africa).
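The deworming arithmetic above can be spelled out in a few lines, using only the numbers quoted in the text ($0.50 per child treated, $3.50 per additional year of schooling):

```python
# Back-of-the-envelope version of the deworming figure discussed above,
# using only the per-child and per-school-year costs quoted in the text.

cost_per_child = 0.50               # cost to treat one child
cost_per_extra_school_year = 3.50   # cost per additional year of schooling

# How many treated children does one additional school-year spread across?
children_per_extra_year = cost_per_extra_school_year / cost_per_child
print(children_per_extra_year)  # 7.0

# Equivalently: the average gain per treated child, in years of schooling.
avg_years_gained_per_child = cost_per_child / cost_per_extra_school_year
print(round(avg_years_gained_per_child, 3))  # 0.143
```

Framed per child, the same result reads as roughly a seventh of a school-year each – the distributional detail the headline "$3.50 per year of schooling" figure hides.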
So how can we make progress on this? One thing that is clear is that we can no longer ignore costs when analyzing impact, unless we are content with assuming a world with no budget constraints or opportunity costs. We don’t live in that kind of world, however, and thus impacts in the absence of cost data don’t give us much guidance for translating evidence into policy (unless we learn that something doesn’t work at all). More evidence from impact evaluations, particularly from longer-term follow-ups that examine the final outcomes we’re interested in, will certainly help us do cost-benefit analysis that even a purist can be happy with. We can also start experimenting with different data collection methods to get at costs. My unit has decided to request a data collection plan that includes cost data when funding future impact evaluations, and to use the next set of funded evaluations as test cases for learning how to collect this kind of data. Any other ideas or examples of cost data – both the success stories and the failures – are of course most welcome.