Using Case Studies to Explore and Explain Complex Interventions


One of the most cited of Martin Ravallion’s many papers implores researchers to “look beyond averages” if they want to better understand development processes. One fruitful area in which this might happen is the assessment of complex interventions, a defining characteristic of which is that they generate wide variation in outcomes. Even when controlling for standard factors that might account for this variation – income, education levels, proximity to transport – outcomes in education, the performance of health care systems, or the extent to which villages participating in community-driven development programs manage conflict, can nonetheless vary considerably: some will be outstanding, some will muddle through, and some will fail spectacularly. In such situations, an important initial task for evaluators is determining the nature and extent of that variation: it will inform the overall assessment of whether the intervention is deemed to ‘work’ (or not).

An important secondary task when assessing complex interventions, however, is explaining and learning from this variation. What makes a complex system truly ‘complex’ is something some social scientists – extending a term developed by physicists and neuroscientists – call its causal density, or the number of independent discretionary interactions connecting inputs and outcomes. By this definition, families are decidedly complex whereas smart phones are merely complicated. By extension, raising (lowering) interest rates by a quarter of a percentage point yields predictable decreases (increases) in inflation, whereas after 18 billion experiments in how to raise children (i.e., roughly the total number of Homo sapiens that have ever lived) each new parent remains highly unsure about how to carry out this most primal of tasks: there is no certainty, only a variable probability, that following key principles (themselves often culturally and historically contingent) will yield the desired outcome.

Development interventions in fields such as governance and justice are complex in this sense, as are key aspects of public services such as education (classroom teaching) and health (curative care). While we should expect high variation in the effectiveness of such interventions, even when they’re carefully designed and faithfully implemented, explaining how and why specific forms of variation occur is both necessary and difficult. It is necessary because this variation can itself be a valuable source of intra-project learning; discerning where and for whom a particular intervention is working (or not) can be a basis on which mid-course corrections are made (e.g., with regard to resource allocation and personnel deployment). Moreover, a second defining feature of complex interventions is that it is virtually impossible to specify or anticipate ex ante the precise problems and solutions that are likely to emerge during implementation; as such, having high-quality monitoring systems in place can help to identify problems before they get too severe, and to locate specific examples of where and how such problems have been solved.

Explaining variation is difficult, however, because it is likely to remain even after controlling for observable factors. Household survey instruments are crucial for mapping the nature and extent of variation, and for enabling observable factors to be incorporated into explanations of it. But a third defining feature of complex interventions is that they are highly susceptible to the influence of statistically ‘unobservable’ factors, such as those stemming from social networks, motivation, legitimacy, expectations and power. Lots of devils lurk in lots of details, and to tease these out, different methodological tools are needed. If the great strength of statistical methods is ‘breadth’, then the complementary strength of qualitative approaches is ‘depth’. Conducting some “deep dives” in exceptional places helps researchers to observe unobservables, to get a more precise read on the specific micro mechanisms by which prevailing inputs manifest themselves as particular outcomes.
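One rough way to decide where such deep dives might go is to rank cases by how far their outcomes depart from what the observable factors alone would predict. The sketch below is only an illustration under simplifying assumptions: it uses a single observable predictor, a straight-line fit, and entirely made-up data. The function names (`fit_line`, `rank_by_residual`, `pick_deviants`) are hypothetical, and none of this is drawn from any actual study.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rank_by_residual(cases):
    """Rank cases by outcome minus predicted outcome, largest first.

    `cases` is a list of (name, observable, outcome) tuples.
    """
    xs = [observable for _, observable, _ in cases]
    ys = [outcome for _, _, outcome in cases]
    intercept, slope = fit_line(xs, ys)
    scored = [(name, y - (intercept + slope * x)) for name, x, y in cases]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def pick_deviants(cases, k=1):
    """Return the k most positive and k most negative outliers by residual."""
    ranked = rank_by_residual(cases)
    return {
        "positive": [name for name, _ in ranked[:k]],
        "negative": [name for name, _ in ranked[-k:]],
    }


# Hypothetical data: (site, observable factor, outcome score).
cases = [
    ("A", 1.0, 2.0),
    ("B", 2.0, 4.0),
    ("C", 3.0, 9.0),
    ("D", 4.0, 8.0),
    ("E", 5.0, 7.0),
]

deviants = pick_deviants(cases)
print(deviants)
```

A ranking like this only flags candidates. Whether a flagged site is exceptional for substantive reasons, or is merely mismeasured, is exactly what the qualitative work is meant to establish.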

An example of this approach in action can be seen in a recent study a team of us undertook on the quality of service delivery in the Middle East and North Africa (MENA) region. States in the region are centralized, so health and education policies are, on paper, the same throughout each country. Analyses of the existing household data sets in MENA, however, revealed enormous variation in how well these policies were being implemented: absenteeism in health clinics in Yemen ranged from 8 to 83%, for example, while schools in Palestine, though generally struggling, nonetheless sometimes yielded performance scores on standardized international tests (TIMSS) that were ‘average’ by global standards – an extraordinary accomplishment in a veritable war zone. But what accounts for this variation? It can’t be policy variation, since policies are effectively ‘constant’ in centralized states, and the household data could only tell us so much (since they typically contained only rather crude information on local structural variables). Conducting case studies in some of these exceptional places – in Palestine (education) and Jordan (health) – helped unpack the causal mechanisms by which, in these contexts, extant policies were being transformed into superior outcomes; in other words, it helped us understand the ‘causes of effects’ (as a counterpart to orthodox evaluation techniques that focus on discerning the ‘effects of causes’). Put differently, it helped discern how certain teams in certain communities were able to take the same policy ‘ingredients’ as everyone else but somehow bake a much better cake.

These findings may or may not be generalizable, but they unpack how complex interventions can work, showcase practical examples of how others working in the same contexts might seek improvement, and help shape a broader policy dialogue that is simultaneously honest with regard to the overall effectiveness of a given delivery system (which is often parlous) yet not without hope: someone, somewhere, somehow has figured out how to do a better job than others. Their insights can be a basis on which improvements elsewhere are sought.


Michael Woolcock

Lead Social Scientist, Development Research Group, World Bank

rick davies
July 20, 2016

Re "Conducting case studies in some of these exceptional places – in Palestine (education) and Jordan (health) – helped unpack the causal mechanisms by which, in these contexts, extant policies were being transformed into superior outcomes"
To me this quote highlights the need to put more cross-case analytic work into studying the many situations where things _don't_work_ and then identifying the False Positives, i.e. the "positive deviants", which can then be the focus of case studies that identify the causal mechanisms at work and that might have the potential to be replicated. Speaking more generally, case studies should follow on from case selection strategies.

July 21, 2016

I agree with Rick's comment that you also need to look at places where things don't work well. Otherwise you may find that what you think is the crucial path in the route to success could also be a route to failure. In other words, you need to compare causal mechanisms for both successful and unsuccessful projects.

Michael Woolcock
July 25, 2016

Thanks Rick, and Fiona. Yes, indeed, to both comments. "Off diagonal" cases -- or "negative" as well as "positive" deviance -- are needed to get a better sense of the process mechanisms at work. We tried to do that in the MENA study I cite, though we could only use secondary sources (since, as you might imagine, it's hard to get officials to let anyone see their worst schools and clinics, etc.). Either way, the more general principle is asking 'Of what is this a case?' of any proclaimed instance of a phenomenon. The answer requires knowledge of the nature and extent of the broader distribution.

July 28, 2016

Thanks for a nice post, Michael, expanding on your previous paper on external validity and hipply updating the example of a complicated intervention from a wristwatch to a smart phone.
I continue to think that 'complex' isn't such a helpful construct but your overall point (learn more about why & how stuff works in particular places by exploring causes of effects) and your example is useful in making the point tangible.
I'd like to further push a point that you & Rick (in comments) raise because it just doesn't seem to be sinking in for some. Case studies are almost always better when purposively selected. Gerring, Lieberman, Quinn Patton & several others have written usefully on this but I'll reiterate a bit.
The following happens too much (as a caricature): 'we didn't understand our impact evaluation results, so we randomly selected some villages and asked a focus group composed of randomly selected people "what gives?"'
Kudos to people doing deeper dives (less kudos to over-reliance on focus groups) but pick cases that will really support more comprehensive learning about how a program & the world interact.
To follow Michael's advice, undo this idea: random = better. And don't replace it with: convenience = fine.
Where and with whom can you learn about the 'median impact narrative' (to borrow from Wydick)? Good, go there. Where are there high scores on the dependent variable? Super, plot the route to those villages/firms/countries. Low performers? Yup, seek them out intentionally, even if they are harder to find. A similar approach can be taken for variation on theoretically meaningful observable independent vars (meaning those playing a hypothesized-to-be-important role in the theory of change), to inform deeper dives amongst those likely to have instructive variation on important unobserved (oops!) or unobservable variables. As you note, variations in implementation progress or quality can also be useful in pinpointing where small n (quant &/or qual) deep dives can facilitate more learning -- & monitoring data can ideally guide selection to look at heterogeneous cases.