Published on Development Impact

Is “impact evaluation” the right term?

Today’s midweek post on the Development Impact blog is a bit lighter fare than the previous two meaty posts on external validity issues from David and Berk. It concerns the very term that is the focus of this blog: impact evaluation.

Yesterday Berk, I, and other research colleagues attended a day-long discussion with operational colleagues in the Africa region on the role of impact evaluation activities under World Bank projects. Many challenges of how to “mainstream” impact evaluation into lending projects were discussed and debated. These challenges include the identification of funding sources, the proper forums for dissemination, and, equally important, the confusion in the wider development community over what is meant by “impact evaluation”.

One colleague relayed, “every time I hear the term impact evaluation I think of ‘car accident’”. On reflection, impact evaluation is certainly not an elegant term. To a layperson the term may connote the activity that an insurance company must undertake in the aftermath of a hurricane or other widespread disaster. To practitioners in certain other disciplines, it seems ill-defined and vague.

At a discussion table, several health colleagues conveyed various definitions of impact evaluation. These included the view that impact evaluation focuses solely on ultimate outcomes such as, in the case of health, mortality (and studies that focus on process indicators such as drug availability should not be considered impact evaluation). Apparently many counterparts in government share this view, as well as the view that impact evaluation is necessarily an expensive and lengthy process because it exclusively focuses on long-term impacts. Another common definition is that impact evaluation studies are necessarily randomized trials.

These varied understandings often lead to real initial problems when working across disciplines and when introducing the possibility of impact evaluation into project design discussions with government counterparts. Sometimes an initial misunderstanding will color a whole series of discussions and may ultimately affect the decision of whether to go forward with a study.

For the community of researchers in which I engage, impact evaluation is a catch-all term for any research that seeks to attribute the causal effect that a program, policy, or intervention has on outcomes or indicators of interest.  It is the emphasis on rigorous causal attribution in an empirical setting that separates the activities that collectively fall under impact evaluation from other research or monitoring activities.

While clearly not universal, this definition is fairly standard among applied social science researchers: it’s echoed in this recent book and by the 3ie for example.

The choice of outcome of interest is driven by the question at hand and can involve ultimate long-term outcomes or simple short-term process indicators. The method of evaluation can be randomization, but can also exploit discontinuities or other non-randomized sources of variation. The time frame can span many years or a matter of days. The study setting can be highly controlled, as in an efficacy trial, or fully immersed in the messy world of at-scale programs, as in effectiveness trials. The particular question under study, and the setting in which it takes place, will determine all of these factors.

My colleagues at the table suggested other applicable terms that may better describe the same activities under discussion: operational research, evaluative research, and implementation science. Here’s my take on each:

Operational research – I see the appeal of this term because the IE work under Bank projects often involves understanding the impact of alternative modes of program implementation – a very important type of program effectiveness research. Operational research with rigorous causal attribution is a key subset of impact evaluation activities. However, I don’t believe this term encapsulates the full scope of IE activities, such as research under highly controlled settings that seeks to elucidate important aspects of human behavior. One example of this type of research is a recent paper by Cohen and Dupas that estimates a demand curve for malaria bed nets.

Evaluative research – This isn’t a bad term. It is sufficiently vague and still conveys the activity of evaluation. But I suppose it missed the bus in that “impact evaluation” has much more widespread usage today.

Implementation science – Similar to operational research, it has a hands-on appeal for many of the effectiveness-type studies we do. However, as argued above, impact evaluation is a wide tent that encompasses not only studies that focus on program implementation but other study topics as well.

After this table discussion, we took a poll of the preferred term to describe the research activities that were under discussion throughout the day. The final tally:

Operational research: 2 votes

Evaluative research:  2 votes

Implementation science: 3 votes

Impact evaluation: 2 votes

The jury did not reach a verdict; deliberations must continue.


Jed Friedman

Lead Economist, Development Research Group, World Bank
