Now, before I get to the list, we know there are a lot of benefits to impact evaluation. And what also occurred to me as I worked through the list was that some of these things can benefit the program, even if they are a slight distortion of resources. And, as a third consideration, it's also important to keep in mind that different methods incur different costs (matching techniques, for example, seem to skip a chunk of these costs -- but they probably make it up in the extra surveys you will have to do to get a sample you can match from). But let's get to the list:
1. A big potential cost is excess recruitment. This really binds when there is a set of program eligibility criteria. For a bunch of methods, we will need to collect data on these criteria for both the treatment and the control group pre-program. Collecting the data on the control group is where the cost comes in. And the more work there is to collect these data, the bigger this cost will be. One of the examples being talked about at this workshop was a psychological therapy program. Folks are screened for this (which obviously makes sense) -- but doing this screening for both the treatment and the control was not trivial. Now, if you are doing randomized phase-in, the bonus of this is that these folks will all be pre-identified when phase 2 of treatment comes around. But, other than that, this is a cost that comes with the evaluation.
2. Holding back the enthusiasm of program staff. In one program that I worked with, the field staff were so eager to get going that they started running out to communities before treatment and control were assigned. These guys were particularly gung-ho, but the general point here is that in cases where you are doing an RCT, with assignment based on data that is being gathered, there is likely to be something of a program delay while the data is entered so that the assignment to treatment and control can be made. Now, the good news is that new computer-assisted data collection tools skip the data entry step, so they should help reduce or eliminate this particular cost. But, if you are pulling your data from excess recruitment that took place through the project's infrastructure, this might be less likely to be computerized.
3. Collecting more data through the project's monitoring system. I've worked on evaluations where the fact that we were doing an impact evaluation added some variables to the monitoring indicators the project was tracking. These can range from more frequent or detailed measures of people who enroll in the program to detailed attendance measures (name, id number, and present/absent for each session). Now given that these focus on steps in the causal chain that underpins the program, this strikes me as sometimes being a less obvious "cost" -- knowing these things in the absence of an impact evaluation might have been useful for the program. And the extra supervision, support, and scrutiny that the evaluation team brings to the monitoring system may (in some cases) improve the overall quality of the system.
4. Multiple arms. The logistical work to make sure that multiple treatment arms are rolled out to the right beneficiaries clearly presents a cost to the program. And one thing to think about is that this cost can end up biting at a critical time: as a program is trying to get activities off the ground, the extra work to make sure that these folks get variant A, those folks get variant B, and the folks over there get variant C may really strain logistical capacity that needs to be making sure anything works.
5. Stretching out the program with sample size. Take the case of a program that was going to target everyone in a defined neighborhood. Now the evaluator rolls in and everyone agrees on the need for a control group. And the idea here is to treat half the folks in the original neighborhood and half in the next one over. Clearly, logistical costs have gone up. Another variant of this comes from randomized phase-in. In one program I worked with, we were evaluating the program in two large districts, a significant distance apart. The program folks treated half of the sub-districts in one district, then half in the other. Then, they had to go back to the first district, and treat the control group and then, on to the second for the same. Clearly, it would have been cheaper for them to hit each contiguous sub-district as they went. But, in this case, we discussed the costs and agreed that it was worth it for the lessons we would get from the evaluation.
6. Distorting the program effort towards the component being evaluated. This issue also came up in the workshop, where one of the program management folks felt that a disproportionate amount of program supervision and other attention was going towards the one (relatively small) component that was the subject of the evaluation, while other, larger, interventions were getting less attention than they might in the absence of the evaluation. This can come from the incentives program folks have to get good results or, as I have seen in practice, from a fascination with the evaluation process and the ability to learn. Either way, this represents a potential distortion (more than a straight cost) for the program and raises the question about what you are evaluating (I'll skip the Heisenberg references on this one).
These are some of the costs that are likely to arise for the programs we work with. Ideally, they will come up in discussions about the design of an evaluation, which helps avoid more difficult discussions later on. Further thoughts on other costs, as well as mitigation measures, are most welcome. Finally, one other thought to ruminate on – given that there are likely to be some positive costs to the program from participating in an impact evaluation, this is one driver of the selection bias into what gets impact-evaluated…