Notes from the field: The danger of programs that pay for performance


I was recently working with an implementing agency to design an impact evaluation, and we were having trouble arriving at a viable design that would answer the big questions about the efficacy of the intervention. Looking back, part of the problem was that this agency was the implementer, not the funder. And they were paid by the funder based on reaching a certain number of people and having those people participate in the program.
From a funder's point of view this makes a lot of sense. For a long time, a lot of aid was measured/judged based on inputs: x dollars disbursed, y school books purchased, and the like. A number of donors have started to move away from this -- for example, the World Bank has a lending instrument which disburses against results as implemented by the recipient country. In the grant-based aid realm, other donors directly contract implementing partners (sometimes for-profit firms, sometimes NGOs) to run things. And they write these contracts (as a smart person would) based on actually delivering a program and getting people to participate. Finally, a number of governments write contracts like this with their own ministries (or private providers) and then hold the relevant officials to account when performance targets aren't met.
Now, as you've probably picked up by now, most of these programs' definitions of results or performance stop with outcomes (i.e. we offered a program, people used it). They don't use impacts because impacts may take a while to manifest and, of course, because a zillion things are confounding the impacts. This makes sense from a contracting point of view.

The problems arise if we are trying to learn something in a rigorous (i.e. impact evaluation) fashion. The first problem is that this, in some cases, can discourage innovation and risk taking in programs. If my contract says that I have to reach 10,000 firms with a certain set of information and I have a way to reach the firms that has worked in the past, I'll go with that, even if there are some innovative but unproven, potentially cheaper ways.
The second problem is that if the contract pays the implementer for "performance" then the deck is stacked against a rigorous evaluation.  Impact evaluations are rarely costless for the implementer; things such as needing to recruit or identify a control group, implementation delays due to survey rounds, and the like can add to the cost of program implementation (I discussed these costs in more detail in an earlier post).   And if the implementer has to reach a target they would prefer to put all of their cash towards that.   Or, if they're a for-profit firm, they definitely won't want to cut into those profits to produce the public good that is impact evaluation.
The obvious thing to do is to put the impact evaluation in the contract.   That is, even if the evaluation will be done by a third party, specify that the implementer has to cooperate with the evaluation (and bear the costs).  This isn't as easy as it sounds, not least because the parameters of the impact evaluation are often going to be realized after the preparatory implementation work has been done (which might be too late to alter the parameters of the implementation contract).   Despite this, I have seen some of the folks who manage these implementation contracts do this effectively.     But a) they are the early adopters, the innovators, and b) they are a minority.    As this mode of programming becomes more common, my concern is that we will increasingly neglect to include this.   And, as I found out during this recent trip, trying to stick an evaluation in a contract that is already written is woefully hard, if not impossible.
It would be good to hear from others on this -- have you had experiences with this?   Any advice on designing the program implementation contract?    Other thoughts?


Markus Goldstein

Lead Economist, Africa Gender Innovation Lab and Chief Economists Office

O Knight
September 10, 2014

Interesting post, thanks. Surely one solution to the problem you describe is to take the impact evaluation out of the contract and award this separately to a third party? More overheads yes, but more likely to be impartial. Third party monitoring/verification is pretty standard for RBF in any case, so it may be that a second contract already exists.

Markus Goldstein
September 15, 2014

Sure, this is definitely one way to go. However, what concerns me is that the impact evaluation could entail additional costs (e.g. recruiting a control group) and it is hard to specify this set of costs, as well as the critical need to collaborate with the evaluator, in the implementer contract.

Steve Glazerman
September 10, 2014

This is oh-so-familiar. Thanks for writing about it. Happens over and over that implementation goals (serve as many beneficiaries as possible) conflict with evaluation goals (recruit as many *study* participants as possible and assign many to control group).
The imperfect solution is to write implementers' performance contracts in a way that gives them credit for control group members even if they are not beneficiaries of the program. It's imperfect because this will be counter-intuitive to donors and aid agencies. Why pay you to not serve someone? But the recruitment costs are real and the information they provide in helping us estimate the counterfactual has a huge benefit to the field.
There will always be a tension between implementation and evaluation. Key is to get buy-in from funders early on, especially if implementation funder is different from the evaluation funder.
Evaluation should compensate the implementer for burden imposed. Implementer should have a contract requirement (with teeth) of cooperating with evaluator.

Markus Goldstein
September 15, 2014

I totally agree Steve, thanks for the thoughts. 

Tania Alfonso
September 11, 2014

What was the time horizon? If you have to reach 10,000 firms this year, then I can see the hesitation about taking risks, but if you have several years to try different things, then finding a cheaper way to achieve the same outcome (and still get paid the same for the results) would be quite profitable.
I say this with the full awareness that most donor contracts are a couple of years in length, not five or ten years.
Finally - a donor could just pay for higher level outcomes, without getting too concerned about attribution. If you have kids vaccinated or graduate from primary school and are willing to pay for that, it leaves the task of figuring out how to do it in the most efficient way possible up to the implementer. Again, given a long enough time horizon to experiment.

Markus Goldstein
September 15, 2014

Excellent point -- thanks Tania -- the time horizon is key, particularly if we are going to write contracts against the impacts/higher level outcomes that the impact evaluation will measure. And to measure these impacts in an attributable way, the impact evaluation would have to be built in. Which then raises a sharp question on how to maintain the integrity of the impact evaluation. And even if you go with a third party contract to maintain the impartiality of the evaluation, there will likely be issues with the precision of the final estimates and which set you want to use, as my friend Dan Gilligan was pointing out to me today.

Michael Eddy
September 12, 2014

Thanks for the post. You rightly lay out the value of paying for longer term outcomes and impacts that give the implementer autonomy to innovate. Unfortunately, as you note, much of pay for performance is still too focused on outputs. In areas where there is no known technology to achieve impacts, or that technology is highly context dependent (low external validity), paying for impacts can be a great discovery mechanism to move governments and donors from planners to searchers.
The impact evaluation problem you note is important and a valuable lesson, however it doesn't seem like it necessarily is an intrinsic problem to pay for performance. When done well and with enough forethought we can combine the best of pay for performance with impact evaluation.
Finally it's worth comparing the alternative to measuring results. Governments spend significant amounts of money doing costly and time consuming audits which really say very little about the actual results achieved. Measuring results rather than receipts doesn't have to be as costly as many believe as long as we can move governments and donors out of the realm of doing costly audits.

Markus Goldstein
September 15, 2014

Many thanks for the thoughtful comments.   I would agree entirely that with enough forethought we can combine the best of pay for performance with impact evaluation -- and it's the forethought I am trying to argue for.   As it stands now, with the increase in pay for performance, and a lack of forethought, my concern is that an impact evaluation is harder to fit into the program post-contract than if the program were not pay for performance.    But forethought would fix this. 

March 12, 2015

I totally agree with this post. Reviewing impact evaluation studies in the agriculture sector for my PhD thesis, I was surprised by the modest number of rigorous evaluations that can be found in the literature. Within this context, I would advance the following point for further discussion. My question is whether funders are really interested in measuring their activity's efficacy in terms of impact, rather than in terms of output. Making a program change the benchmark economic or social conditions requires a sound theory of change. The intervention logic of a program is often based on general beliefs, rather than empirical evidence. Consider as well that even for programs involving large cash transfers (such as the structural funds of the European Union) there is no robust evidence of their positive net effects. Summing up, I wonder whether the focus on output is a way to hide programs' inefficacy.
Eager to hear other comments.