The production process for an impact evaluation is strikingly more complicated than that of a standard empirical paper. It involves all of the same tasks of formulating hypotheses, estimating models, and writing up research, but also a whole host of additional tasks, ranging from designing interventions, designing questionnaires, implementing random assignment, and carrying out data collection and data entry, to mundane tasks like paying survey staff and filling out paperwork for funders.
How should we model this process?
I think there are (at least) two competing theories of how we should think of this production process, with very different implications for how researchers should allocate their time.
The first theory I have in mind is the O-ring theory of Michael Kremer. In this theory production consists of many tasks, all of which must be successfully completed for the product to have full value. He assumes that it is not possible to substitute several low-skill workers for one high-skill worker, and a consequence is assortative matching, whereby rather than supervising more employees, high-skill agents are matched with high-skill coworkers.
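The post does not write the model out, but in Kremer's formulation (the 1993 QJE paper) expected output is multiplicative in the quality of every task, which is what generates the "one weak link ruins everything" logic:

```latex
% O-ring production function (Kremer 1993):
% n tasks, worker skill (success probability) q_i in [0,1],
% capital k with share alpha, and per-task output scale B
E(y) = k^{\alpha} \left( \prod_{i=1}^{n} q_i \right) n B
```

Because the \(q_i\) enter as a product rather than a sum, the return to raising quality in any one task is higher when quality in all the other tasks is high, so high-skill workers are matched together rather than spread across teams.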
So in this model, the idea is that one tiny screw-up in one part of an impact evaluation can derail the whole project – a skip pattern goes wrong and you lose a key variable, the randomization gets incorrectly implemented and all that follows is invalidated, incorrect paperwork gets filled out and grant funding gets suspended, etc. The implication is that researchers should be highly involved in every step of their impact evaluation, with collaboration taking the form of teams of high-quality researchers, but little delegation of tasks outside this core team.
A competing theory is the Knowledge Hierarchy idea of Luis Garicano and co-authors. Their idea is that efficient production in society requires “conserving the superior knowledge of experts for the right questions, that is, effectively matching questions and expertise”. The solution is a knowledge-based hierarchy in which experts leverage their knowledge by having less expensive workers deal with routine tasks, and only if a problem is too hard relative to a worker's knowledge level does it get passed up to the experts above them. They acknowledge that information asymmetries complicate this procedure, but show that efficient outcomes can be possible with appropriate compensation schemes in which the expert is the residual claimant on production.
So here the idea is that it is clearly not efficient for researchers to be spending their time dealing with all the mundane tasks of an impact evaluation (like deciding which venues to hold training in, or filling out expenditure reports, or formatting questionnaires, etc.), and their time is better spent focusing on the parts of the evaluation where they can best use their skills – like designing the experiment or evaluation method, thinking about theories to test, and designing testing strategies. A lot of other tasks then get delegated to others – so this is the model where a field assistant is responsible for a lot of day to day logistics, or a professional survey organization is in charge of collecting the data, etc.
So which is it?
I’m sympathetic to both views, but perhaps lean more towards the knowledge hierarchy view, the exception being a few key steps that may be the most O-ring-like, where I want to be closely involved regardless of whether this is difficult or not. For example, I like to do the random assignment myself wherever possible when doing an experiment. But I see a number of different trade-offs and factors that make me think we are all still figuring out the right technology (and that the right technology is likely to vary by person and project):
- I’m struck by how many emails I get about mundane fieldwork or budget issues and how hard it is to avoid dealing with some of them – this is a common lament of many researchers, who seem to be spending way too much time doing things that are clearly not what they spent years training as an economist to do. The value of the time I once spent trying to figure out how to get 10 iPads as prizes for firms participating in a survey would easily have been several times the cost of said iPads.
- Budget is obviously one factor – graduate students typically have lots of time and little funding, so end up spending a lot of time doing stuff in the field themselves – more senior researchers have many more competing demands on their time, and hopefully better access to funding, so are more able to hire people.
- Some countries have better infrastructure for outsourcing tasks than others – I’ve been lucky to find very high quality professional survey companies in many of the countries I’ve worked in, who are clearly way better at survey logistics than I am. But I’ve also had experiences in a couple of places where contracting the surveys to a survey company has been much more problematic.
- I do worry about principal-agent issues, and the fact that no one cares about your impact evaluation as much as you do. The modal way of handling this seems to be to hire smart young people who want to go on to graduate school, where incentives are aligned for them to work hard, learn lots, and get a good recommendation letter from you. It can be harder when you need to rely on someone who is just doing this as a job.
- I think researchers often think that being good at research makes them good at other tasks, which need not always be the case – for example, I’ve seen researchers putting together publicity materials to try to encourage people to participate in a program, when it is not clear we are better at doing that than organizations who do it as part of their main job. But on the other hand, sometimes there just seem to be so many ways that things could be done better than the status quo…
- It is not clear what the uber-production function for research is that we are trying to maximize. The economics profession rewards one research paper in a top-5 journal more than, say, five good publications in journals outside this narrow set – so perhaps this pushes academics towards exploiting absolute rather than comparative advantage. But it is less clear this is optimal for society.