I have just finished writing up and expanding my recent policy talk on active labor market policies (ALMPs) into a research paper (ungated version), which provides a critical overview of impact evaluations on this topic. While my talk focused more on summarizing my own work, for this review paper I looked much more closely at the growing number of randomized experiments evaluating these policies in developing countries. Much of this literature is very new: of the 24 RCTs whose results I summarize in several tables, 16 were published in 2015 or later, and only one before 2011.
I focus on three main types of ALMPs: vocational training programs, wage subsidies, and job search assistance services like screening and matching. I’ll summarize a few findings and implications for evaluations that might be of most interest to our blog readers. The paper, of course, provides a lot more detail and discusses further some of the implications for policy and for other types of ALMPs.
1. An Iron Law of ALMPs: offer a program to 100, help 2 get jobs?
2. No studies have sufficient power to detect a 2 percentage point effect, so the modal study does not find a significant impact.
17 out of these 22 studies do not find a statistically significant impact on employment. I note in the paper that, for example, if 13 percent of the control group get employed anyway, you need samples of 6,424 treatment and 6,424 control to have 80 percent power to detect a 2 percentage point treatment effect – this is larger than all of the ALMP experiments covered.
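To make this kind of power calculation concrete, here is a minimal sketch using the textbook normal-approximation formula for comparing two proportions with equal-sized arms. The function name `n_per_arm`, the choice of formula, and the optional continuity correction are my assumptions for illustration, not necessarily the calculation used in the paper, so the output need not match the figure quoted above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, p_treat, alpha=0.05, power=0.80,
              continuity_correction=False):
    """Sample size per arm for a two-sided test of two proportions.

    Hypothetical helper: normal approximation, equal allocation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    delta = abs(p_treat - p_control)
    p_bar = (p_control + p_treat) / 2

    # Standard deviation terms under the null (pooled) and the alternative.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(
        p_control * (1 - p_control) + p_treat * (1 - p_treat)
    )
    n = (null_term + alt_term) ** 2 / delta ** 2

    if continuity_correction:
        # Common continuity correction for equal-sized arms.
        n = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n)


if __name__ == "__main__":
    # 13 percent control employment, 2 percentage point treatment effect.
    for power in (0.80, 0.90):
        for cc in (False, True):
            n = n_per_arm(0.13, 0.15, power=power, continuity_correction=cc)
            print(f"power={power:.2f}, correction={cc}: n per arm = {n}")
```

Varying `power` and `continuity_correction` shows how sensitive the required sample is to these choices; under any of them, detecting a 2 percentage point effect takes several thousand observations per arm.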
3. It gets worse: attrition and spillovers
This second figure shows the attrition rates for the same 22 studies. The median and mean attrition rates are both 18 percent. So these studies are trying to detect an impact of only 2 to 3 percentage points, but are losing almost 10 times this in attrition. Of course they try to do the usual things: looking at balance on baseline observables for the non-attritors, comparing attrition rates for treatment and control, and using Lee bounds where they differ. But clearly any Manski-type bounds will be completely uninformative, and we do need to worry that attrition might be selecting differently on the outcome in the two groups: e.g. perhaps those who go through your program and don’t find a job don’t want to answer the survey, while in the control group it might be those who are employed who are too busy to answer.
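A quick sketch shows why Manski-type worst-case bounds are uninformative at these attrition rates. For a binary employment outcome, the bounds assume all attritors are either employed or not employed, in whichever way is least favorable. The observed employment rates below (15 and 13 percent) are hypothetical numbers for illustration only; the 18 percent attrition rate is the figure from the text.

```python
def manski_bounds(y_treat_obs, y_control_obs, attr_treat, attr_control):
    """Worst-case bounds on the treatment effect for a binary outcome.

    y_*_obs are employment rates among survey respondents;
    attr_* are attrition rates in each arm.
    """
    # Arm means if every attritor has outcome 0 (low) or 1 (high).
    treat_low = (1 - attr_treat) * y_treat_obs
    treat_high = treat_low + attr_treat
    control_low = (1 - attr_control) * y_control_obs
    control_high = control_low + attr_control
    return treat_low - control_high, treat_high - control_low


# Hypothetical observed rates; 18 percent attrition in both arms.
lower, upper = manski_bounds(0.15, 0.13, 0.18, 0.18)
print(f"bounds on treatment effect: [{lower:.3f}, {upper:.3f}]")
print(f"width of bounds: {upper - lower:.2f}")
```

The width of the interval equals the sum of the two attrition rates, so with 18 percent attrition in each arm the bounds span 36 percentage points regardless of the observed employment rates, far wider than a 2 to 3 percentage point effect.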
Finally, on top of all this, there is the concern of spillovers and general equilibrium effects: namely, that these programs merely change which job-seekers get the available job openings, without changing the number of jobs in the economy. This is a tough challenge to deal with, although I point in the paper to a couple of studies that have attempted to look at this question by randomizing at the local labor market level, or by randomizing at the firm level to see if firms create more jobs when linked to some of these programs.
4. Can we then justify these programs on other outcomes or subgroups?
When faced with this lack of impact on employment, studies have often used one or both of two approaches to present positive impacts of the program. The first has been to de-emphasize employment as an outcome, saying that in low-income economies everyone typically works a bit, so what matters is the quality of the job, and then to present impacts on formal employment, or permanent employment, or wage employment (rather than self-employment). The second is to examine impacts for subgroups, and emphasize that while there is no impact overall, the program has significant impacts for one gender, or one other group.
While both of these seem perfectly reasonable, I note a few concerns with these approaches in the paper.
What is a quality job? I note two main objections to the way quality jobs have been used to justify impacts. The first is that there is a long debate on “what is a quality job”. I am particularly resistant to definitions in which it is assumed that formal wage employment = good, and self-employment = bad. There is a long literature that emphasizes the tradeoffs individuals make in choosing between sectors, and the heterogeneity in both sectors. Second, if job quality is better, we should expect to see people earning more. But none of the studies finds significant impacts on overall labor earnings.
No consistency on gender: it does seem perfectly plausible that programs will work better for some individuals than others, and thus how to target these programs is an important research question. However, there appears to be a view that we know that these programs work better for women than men, which I think largely comes from the vocational training work of Attanasio and co-authors in Colombia. But I note two problems with this. First, a number of studies make the mistake of seeing significant coefficients for one gender and not for another, and so emphasizing that the program works well for the first and not the second, without ever formally testing for equality of impacts. Second, table 2 in my paper shows that the 11 vocational training studies all either don’t reject equality of impact by gender, don’t test for equality, or, in studies from Argentina and the Dominican Republic, reject equality in favor of stronger impacts for men.
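The first problem is easy to illustrate. The sketch below uses made-up subgroup estimates (a 6 percentage point effect for women and 2 for men, each with a standard error of 3 points) and a simple z-test that treats the two subgroup estimates as independent. The numbers and helper names are hypothetical, and in practice one would usually test the interaction term in a pooled regression instead.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: estimate == 0 (normal approximation)."""
    z = abs(estimate / se)
    return 2 * (1 - NormalDist().cdf(z))


def equality_p(b1, se1, b2, se2):
    """p-value for H0: b1 == b2, assuming independent estimates."""
    return two_sided_p(b1 - b2, sqrt(se1 ** 2 + se2 ** 2))


# Hypothetical subgroup treatment effects and standard errors.
b_women, se_women = 0.06, 0.03
b_men, se_men = 0.02, 0.03

print(f"women: p = {two_sided_p(b_women, se_women):.3f}")
print(f"men:   p = {two_sided_p(b_men, se_men):.3f}")
print(f"equality of impacts: p = "
      f"{equality_p(b_women, se_women, b_men, se_men):.3f}")
```

With these numbers the estimate for women is individually significant at the 5 percent level and the one for men is not, yet the test of equality does not come close to rejecting. Significance in one subgroup and not the other is not evidence that the impacts differ.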
5. Should we just give up on these programs?
Not all of them. I note that the return to vocational training seems consistent with what we see on returns to schooling per se; it is just that policymakers and participants dramatically overestimate how useful such programs can be. Moreover, some of the screening and matching programs are cheap enough that a lasting 2 percentage point impact could still be enough to justify them on a cost-benefit basis. But these evaluations definitely call into question some of the assumptions used to justify these programs. I therefore conclude by discussing new directions for ALMPs.