One of the most cited of Martin Ravallion’s many papers implores researchers to “look beyond averages” if they want to better understand development processes. One fruitful area in which this might happen is the assessment of complex interventions, a defining characteristic of which is that they generate wide variation in outcomes. Even when controlling for standard factors that might account for this variation – income, education levels, proximity to transport – outcomes in education, the performance of health care systems, or the extent to which villages participating in community-driven development programs manage conflict can nonetheless vary considerably: some will be outstanding, some will muddle through, and some will fail spectacularly. In such situations, an important initial task for evaluators is determining the nature and extent of that variation: it will inform the overall assessment of whether the intervention is deemed to ‘work’ (or not).
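To make ‘looking beyond averages’ concrete, the sketch below (in Python, using hypothetical site-level data and column names not drawn from any particular study) shows one way an evaluator might summarize the nature and extent of variation in an outcome across sites rather than reporting only its mean.

```python
# A minimal sketch, not a prescribed method: summarizing how outcomes vary
# across sites instead of reporting just the average. The data and column
# names here are hypothetical placeholders.
import pandas as pd

def describe_variation(df: pd.DataFrame, outcome: str = "outcome") -> pd.Series:
    """Report the spread of an outcome across sites, beyond its mean."""
    x = df[outcome]
    return pd.Series({
        "mean": x.mean(),
        "std_dev": x.std(),
        "p10": x.quantile(0.10),    # how the weakest sites fare
        "median": x.median(),
        "p90": x.quantile(0.90),    # how the strongest sites fare
        "min": x.min(),
        "max": x.max(),
    })

# Illustrative placeholder data: a mean of roughly 0.5 hides sites that are
# outstanding, sites that muddle through, and sites that fail.
sites = pd.DataFrame({
    "site": ["A", "B", "C", "D", "E", "F"],
    "outcome": [0.10, 0.35, 0.45, 0.55, 0.70, 0.95],
})
print(describe_variation(sites))
```

The gap between the 10th and 90th percentiles, not the mean alone, is what signals whether some sites are outstanding while others are failing.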
An important secondary task when assessing complex interventions, however, is explaining and learning from this variation. What makes a complex system truly ‘complex’ is something some social scientists – extending a term developed by physicists and neuroscientists – call its causal density, or the number of independent discretionary interactions connecting inputs and outcomes. By this definition, families are decidedly complex whereas smartphones are merely complicated. By extension, raising (lowering) interest rates by a quarter of a percentage point yields predictable decreases (increases) in inflation, whereas after 100 billion or so experiments in how to raise children (i.e., roughly the total number of Homo sapiens that have ever lived) each new parent remains highly unsure about how to carry out this most primal of tasks: there is no certainty, only a variable probability, that following key principles (themselves often culturally and historically contingent) will yield the desired outcome.
Development interventions in fields such as governance and justice are complex in this sense, as are key aspects of public services such as education (classroom teaching) and health (curative care). While we should expect high variation in the effectiveness of such interventions, even when they’re carefully designed and faithfully implemented, explaining how and why specific forms of variation occur is both necessary and difficult. It is necessary because this variation can itself be a valuable source of intra-project learning; discerning where and for whom a particular intervention is working (or not) can be a basis on which mid-course corrections are made (e.g., with regard to resource allocation and personnel deployment). Moreover, a second defining feature of complex interventions is that it is virtually impossible to specify or anticipate ex ante the precise problems and solutions that are likely to emerge during implementation; as such, having high quality monitoring systems in place can help to identify problems before they get too severe, and locate specific examples of where and how such problems have been solved.
Explaining variation is difficult, however, because it is likely to remain even after controlling for observable factors. Household survey instruments are crucial for mapping the nature and extent of variation, and for enabling observable factors to be incorporated into explanations of it. But a third defining feature of complex interventions is that they are highly susceptible to the influence of statistically ‘unobservable’ factors, such as those stemming from social networks, motivation, legitimacy, expectations and power. Lots of devils lurk in lots of details, and to tease these out, different methodological tools are needed. If the great strength of statistical methods is ‘breadth’, then the complementary strength of qualitative approaches is ‘depth’. Conducting some “deep dives” in exceptional places helps researchers to observe unobservables, to get a more precise read on the specific micro mechanisms by which prevailing inputs manifest themselves as particular outcomes.
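One way the breadth of surveys and the depth of case studies can be paired, offered here purely as a sketch rather than as the procedure used in any particular study, is to regress the outcome on the observable controls and then select for fieldwork the sites that most outperform what those observables would predict. The dataset and column names below are hypothetical.

```python
# A hedged sketch of selecting "exceptional places" for qualitative deep dives:
# fit outcomes on observable controls, then flag sites whose results sit far
# above what the observables predict. All names here are hypothetical.
import pandas as pd
import statsmodels.api as sm

def flag_positive_outliers(df: pd.DataFrame,
                           outcome: str,
                           controls: list,
                           top_n: int = 5) -> pd.DataFrame:
    """Return the sites with the largest positive residuals after
    controlling for observable factors (income, education, remoteness, ...)."""
    X = sm.add_constant(df[controls])       # the observables we can control for
    model = sm.OLS(df[outcome], X).fit()
    out = df.copy()
    out["residual"] = model.resid           # the part observables cannot explain
    return out.nlargest(top_n, "residual")  # candidates for case-study fieldwork

# Usage with hypothetical columns: sites doing much better than their income,
# adult education and remoteness would predict become candidates for deep dives.
# candidates = flag_positive_outliers(sites_df, "test_score",
#                                     ["avg_income", "adult_education", "km_to_city"])
```

The residual is only a screening device; the point of the subsequent fieldwork is precisely to probe the unobservables – networks, motivation, legitimacy, expectations, power – that the regression cannot capture.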
An example of this approach in action can be seen in a recent study a team of us undertook on the quality of service delivery in the Middle East and North Africa (MENA) region, where policies governing schools and health clinics are largely set by central governments. Analyses of existing household data sets nonetheless revealed enormous variation in how well these policies were being implemented: absenteeism in health clinics in Yemen ranged from 8 to 83%, for example, while schools in Palestine, though generally struggling, sometimes yielded performance scores on standardized international tests (TIMSS) that were ‘average’ by global standards – an extraordinary accomplishment in a veritable war zone. But what accounts for this variation? It can’t be policy variation, since policies are effectively ‘constant’ in centralized states, and the household data could only tell us so much (since it typically contained only rather crude information on local structural variables). Conducting case studies in some of these exceptional places – in Palestine (education) and Jordan (health) – helped unpack the causal mechanisms by which, in these contexts, extant policies were being transformed into superior outcomes; in other words, it helped us understand the ‘causes of effects’ (as a counterpart to orthodox evaluation techniques that focus on discerning the ‘effects of causes’). Put differently, it helped discern how certain teams in certain communities were able to take the same policy ‘ingredients’ as everyone else but somehow bake a much better cake.
These findings may or may not be generalizable, but they unpack how complex interventions can work, showcase practical examples of how others working in the same contexts might seek improvement, and help shape a broader policy dialogue that is simultaneously honest with regard to the overall effectiveness of a given delivery system (which is often parlous) yet not without hope: someone, somewhere, somehow has figured out how to do a better job than others. Their insights can be a basis on which improvements elsewhere are sought.