The “science of delivery”, a relatively new term among development practitioners, refers to the focused study of the processes, contexts, and general determinants of the delivery of public services and goods. Or to paraphrase my colleague Adam Wagstaff, the term represents a broadening of inquiry towards an understanding of the “how to deliver” and not simply a focus on the “what to deliver”.
Impact evaluations, which in principle can help answer either the “what” or the “how” question, are typically focused on the “what to deliver” question – by that I mean the identification of the causal effect of intervention X on outcome Y in context Z. This basic formulation is the why and wherefore of impact evaluation. Of course answering questions of this nature can be highly illuminating, especially when the intervention is previously untested but promising, or when the identifying exogenous variation sheds light on some fundamental aspect of human behavior.
But I am far from the only one to notice the somewhat limited generalizability of the policy implications of many standalone IEs – after all, several key mediators within context Z are not observed, and some may not even have been imagined by the researcher. These thoughts easily segue into the wider dialogue around external validity that we on this blog have delved into again and again (such as Berk’s post earlier this week). But I want to focus here on the learning opportunities evaluative research offers the “science of delivery”, even when the evaluation itself is focused on the “what” and not the “how”.
Earlier I blogged about a paper by Angrist, Pathak, and Walters that measured the effectiveness of charter schools in Massachusetts. The aspect of that paper that most excited me was the exploration of correlates of treatment heterogeneity: it turned out that one particular school model – one with a focus on increased instruction time and the creation of high expectations for students – explains much of the school-level variation in effectiveness. Of course this systematic exploration of observed correlates of treatment heterogeneity can be hit and miss, largely driven by what happens to have been collected in the data. Yet policy insights can emerge from these exercises.
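To make this kind of exercise concrete, here is a minimal sketch of how a correlates-of-heterogeneity analysis is often set up: estimate an effect for each site, then regress those site-level estimates on observed site characteristics. The data and the variable names (instruction_hours, expectations_index) are simulated and purely illustrative – this is not the authors’ actual lottery-based estimator.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_sites = 40

# Simulated site-level data (hypothetical variable names, illustration only).
instruction_hours = rng.normal(0, 1, n_sites)     # standardized instruction time
expectations_index = rng.normal(0, 1, n_sites)    # standardized "high expectations" score
site_effect = (0.2 + 0.15 * instruction_hours + 0.10 * expectations_index
               + rng.normal(0, 0.05, n_sites))    # estimated treatment effect per site

sites = pd.DataFrame({"site_effect": site_effect,
                      "instruction_hours": instruction_hours,
                      "expectations_index": expectations_index})

# The "correlates of treatment heterogeneity" step: which observed
# characteristics predict the site-level effect estimates?
fit = smf.ols("site_effect ~ instruction_hours + expectations_index", data=sites).fit()
print(fit.params)
```

The obvious caveat, as noted above, is that this step can only ever be as informative as the set of site characteristics that happened to be collected.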
Now another good example of this approach, one that looks at the effects of regional transfers within the European Union, was just published in AEJ: Economic Policy. This paper, by Sascha Becker, Peter Egger, and Maximilian von Ehrlich, identifies the impact of the EU’s main regional transfer program – one intended to provide transfers to the poorest regions of the EU in order to foster catch-up growth – on regional net investment and growth. (In this context, regions are contiguous sub-national units with populations of roughly 1 to 3 million.)
By policy rule, only regions with GDP per capita below 75 percent of the EU average are eligible for these EU transfers. This GDP per capita discontinuity in regional eligibility enables a regression discontinuity design (RDD), but the authors extend this standard method to explicitly model covariate heterogeneity in the impact (they do this for both parametric and non-parametric RDD estimators, and the two approaches yield consistent findings).
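For readers who want to see the shape of such a specification, here is a minimal sketch of a parametric RDD that lets the treatment effect vary with a covariate. The data and variable names (gdp_pc, capacity, the 15-point bandwidth) are simulated and hypothetical – this is not the authors’ exact estimator, and they also implement a non-parametric version.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Simulated regional data (hypothetical variable names, illustration only).
gdp_pc = rng.uniform(50, 100, n)          # GDP per capita as % of the EU average
capacity = rng.normal(0, 1, n)            # standardized proxy for absorptive capacity
eligible = (gdp_pc < 75).astype(int)      # treatment: below the 75% eligibility threshold
running = gdp_pc - 75                     # running variable centered at the cut-off

# In the simulation, growth responds to the transfer only where capacity is high.
growth = 1.0 + 0.02 * running + eligible * (0.5 + 1.0 * capacity) + rng.normal(0, 1, n)

df = pd.DataFrame({"growth": growth, "eligible": eligible,
                   "running": running, "capacity": capacity})

# Parametric RDD with covariate heterogeneity: linear trends in the running variable
# on each side of the cut-off, plus an eligible-by-capacity interaction.
window = df[df["running"].abs() < 15]     # restrict to a bandwidth around the cut-off
model = smf.ols("growth ~ eligible * capacity + running + running:eligible",
                data=window).fit(cov_type="HC1")
print(model.summary())

# The coefficient on 'eligible' is the effect at the mean of the capacity proxy;
# 'eligible:capacity' captures how the effect varies with absorptive capacity.
```

The interaction coefficient is what turns a single “what works” estimate into a statement about the conditions under which it works.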
The authors focus on one key determinant of transfer performance – the capacity of the local government system to absorb the transfer and channel the funds into productive investments. They proxy this absorptive capacity with both the mean educational attainment of the adult population and measures of the quality of local governance.
It turns out that not only is there an identifiable yet modest effect of transfers on subsequent investment and growth, but there is also substantial heterogeneity in that effect. And this heterogeneity is systematically related to the absorptive capacity measures:
- For regions with income around the eligibility cut-off and with above-average educational attainment, the transfer program led to a 1.1 percentage point gain in per capita income growth, while for regions with below-average human capital there is no measured effect of the program.
- For regions with above-average governance quality, the transfers resulted in a 2.4 percentage point increase in per capita growth, while regions with below-average quality of governance saw no gain in growth.
- Similar results are found when investigating the impact of the program on local net investment levels – the transfers do not appear to crowd out private-sector investment in the high-capacity regions.
So here is a good example of impact evaluation not only helping to determine the “what to deliver”, but also shedding additional light on the “how”, and in a fairly broad context. Of course much in the context is still unobserved and potentially crucial, but at least we know more than we did before. In general, impact estimates of large-scale programs will almost assuredly encompass some degree of treatment heterogeneity, and I hope we will see much more work in the near future that ties this heterogeneity to the pre-existing conditions that mediate program performance.
In this regard, impact evaluation will be a useful tool that not only can formally test between different modes of service delivery – helping to answer the “how” – but can also enhance our understanding of the contextual factors that determine when and where we should supply the “what”.