Published on Development Impact

A proposed taxonomy of behavioral responses to evaluation


My summary of recent attempts to quantify the Hawthorne effect a few weeks back led to some useful exchanges with colleagues and commenters who pointed me to further work I hadn’t yet read. It turns out that, historically, there has been a great deal of inconsistent use of the term “Hawthorne effect”. The term has referred not only to (a) behavioral responses to a subject’s knowledge of being observed – the definition we tend to use in impact evaluation – but also to (b) behavioral responses to simple participation in a study, or even (c) a subject’s wish to alter behavior in order to please the experimenter. Of course all these definitions are loosely related, but it is important to be conceptually clear in our use of the term since there are several distinct inferential challenges to impact evaluation arising from the messy nature of behavioral responses to research. The Hawthorne effect is only one of these possible challenges. Let me lay out a classification of different behavioral responses that, if and when they occur, may threaten the validity of any evaluation (with a strong emphasis on may).

First an aside – it’s interesting to note that the original Hawthorne studies conducted at the Hawthorne plant outside Chicago don’t hold up under scrutiny. It turns out there is little in the original data that points to an actual Hawthorne effect – the series of studies contained so many uncontrolled variables that identification of an effect is impossible. Levitt and List re-analyze the first round of experiments and conclude “Perhaps the most important lesson from the original Hawthorne experiments is the power of a good story. The mythology surrounding the Hawthorne experiments arose largely absent careful data analysis…” Nevertheless, Levitt and List concede the general possibility of a Hawthorne effect even if it wasn’t present in the actual Hawthorne studies.

Other researchers go even further. One 2006 paper is entitled “The Hawthorne effect is a myth, but what keeps the story going?” A second paper argues that the term ‘Hawthorne effect’ should be avoided: “Instead… researchers should introduce specific psychological and social variables that may have affected the outcome under study but were not monitored during the project, along with the possible effect on the observed results.” A clear problem with this suggestion, though, is that we don’t know exactly why and in which contexts the Hawthorne effect occurs, so it would be difficult to account for it through study observables. More importantly, even if we knew exactly why the Hawthorne effect arises in a given context, we may be unable to avoid it if the only available method to assess the targeted behavior is concomitant with the participants’ knowledge of observation.

Regardless of whether there was an actual Hawthorne effect at the Hawthorne plant over the various study rounds of 1924-1933, there have been numerous subsequent studies that identify subjects modifying behavior (in the real world, not only in the lab) as a result of observation. I listed in my last post 12 such studies that attempt to quantify Hawthorne effects in health research. And my list wasn’t meant to be exhaustive.

OK so at times it appears the Hawthorne effect is real. With that said, this effect is not the only possible behavioral response to evaluation. Let me now distinguish among the following behavioral possibilities:

  • A subject’s response to the knowledge of participation in an evaluation treatment – this is conceptually distinct from the Hawthorne effect because it differentially affects treated and control units even if both groups are under equivalent levels of observation. It’s certainly a theoretical possibility that the awareness of participating in an experimental intervention can affect behavior. Potential channels for this effect include excitement over a change from routine activities, or a desire to please the evaluators. Let me term this effect the “Spotlight” effect.
  • On the other hand, if behavior is influenced by the knowledge of being left out of an evaluation treatment – perhaps due to demotivation from awareness of exclusion – then this is termed the John Henry effect. It’s only applicable to control groups.
  • Let’s reserve the Hawthorne effect to refer to a subject’s response to the knowledge of being under observation. Typically for this effect to arise, I imagine, there would need to be a perceived return to behaving in a certain manner.

Since all of these effects are intentional responses to either the knowledge of participation or the knowledge of observation, they are driven by the meaning a subject attaches to (a) the fact of being under observation and/or (b) participation in a social experiment. As I’ve written about in the past when discussing the “meaning response” as an explanation of the placebo effect, social experiments need to consider how study subjects will interpret study participation and the fact of being under observation, and whether any unintended meaning may be attached to either activity. Levitt and List discuss something very similar when they note how scrutiny (observation) influences the degree of exhibited pro-social behaviors in laboratory experiments.
  • Finally, while the Hawthorne effect is an intentional response to being placed under observation, let’s consider a separate non-intentional response to observation as, at least, a theoretical possibility. This type of effect, which I term the “Suggestion” effect, has been noted in a study of health insurance where take-up of insurance for surveyed subjects months after interview was far higher than for subjects monitored through administrative data but not interviewed.
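To make the inferential stakes concrete, here is a minimal simulation sketch (all numbers are illustrative assumptions, not estimates from any study) of how a Hawthorne-style response can contaminate a naive difference-in-means estimate when only the treatment arm is under observation, and how balancing observation across arms removes that particular bias:

```python
import random

random.seed(0)

# Illustrative assumptions (hypothetical magnitudes):
N = 10_000
TRUE_EFFECT = 1.0   # assumed true effect of the intervention
HAWTHORNE = 0.5     # assumed behavioral response to being observed


def outcome(treated: int, observed: int) -> float:
    """Simulated outcome: noise + treatment effect + observation response."""
    return random.gauss(0, 1) + TRUE_EFFECT * treated + HAWTHORNE * observed


# Scenario 1: only the treatment group is observed (e.g., monitored delivery).
# The estimate conflates the treatment effect with the observation response.
t1 = [outcome(treated=1, observed=1) for _ in range(N)]
c1 = [outcome(treated=0, observed=0) for _ in range(N)]
est_unbalanced = sum(t1) / N - sum(c1) / N

# Scenario 2: both arms are observed with equal intensity, so the
# Hawthorne response differences out of the comparison.
t2 = [outcome(treated=1, observed=1) for _ in range(N)]
c2 = [outcome(treated=0, observed=1) for _ in range(N)]
est_balanced = sum(t2) / N - sum(c2) / N

print(round(est_unbalanced, 2))  # ≈ TRUE_EFFECT + HAWTHORNE, not TRUE_EFFECT
print(round(est_balanced, 2))    # ≈ TRUE_EFFECT
```

Note this only neutralizes a Hawthorne effect that is common to both arms; it does nothing for the “Spotlight” or John Henry effects, which by construction differ between treated and control units even under identical observation.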

Here’s a table delineating these four effects as well as suggesting possible mechanisms that explain them. I also sketch out what is probably the only means of definitively identifying each of the effects:


Note that in this taxonomy I only consider effects that may influence actual behavior, and not effects that influence the veracity of reported information (as distinct from actual behavior). Methods to minimize what is termed social desirability bias in survey response have been previously discussed in the blog here and here.

So how much should we worry about these effects anyway? Of course it’s difficult to say, as the evidence is generally slim and context matters a great deal – none of these effects can arise if study subjects (a) don’t know they are participating in an evaluation and (b) are unaware of being under observation. Quite a few impact evaluations – usually large-scale policy evaluations relying mainly on administrative data – satisfy these two conditions.

The Hawthorne effect probably has the largest number of studies devoted to measuring it, but even here the number of studies is not overwhelming and the study contexts are not incredibly diverse. It is somewhat reassuring that, based on evidence from observations of teachers and health providers, the Hawthorne effect appears to wear off rather quickly as the subject habituates to observation.

I know of no study that tries to identify and measure the “Spotlight” effect through the suggested means of identification in the table above. David recently reviewed the evidence for John Henry effects and concludes the evidence is thin at best. The only recent paper I know that investigates the “Suggestion” effect is cited above.

Are “Spotlight”, John Henry, and “Suggestion” effects truly rare? Which, if any, contexts are likely to engender these effects? Or maybe these effects are more pervasive than we suspect but have not yet been identified? I don’t know.

Please let us know of any relevant work not mentioned here. Perhaps the blog should consider maintaining an open list of relevant research.


Jed Friedman

Lead Economist, Development Research Group, World Bank
