Published on Development Impact

How can we do better business training evaluations?

Markus Goldstein

September 18, 2012

Last week I blogged about a paper that David wrote with Chris Woodruff which takes stock of the existing evidence on the impact of business trainings. The bottom line was that we still don’t know much. Part of the reason is that these types of evaluations are not straightforward to do – they have some pitfalls that you don’t always find in your garden variety impact evaluation. So today I’ll talk about some of the lessons that I learned from their paper, and throw in one or two lessons from my own recent experiences (and frustrations) with doing some business training evaluations.

First, one of the reasons a lot of the studies don’t find effects is that they have low statistical power. So small sample size is an issue. And, as David and Chris point out, heterogeneity in the types of firms compounds this – and this is probably the most significant issue.

Ok, so the basic issue of power is a garden variety problem – but there are some things specific to businesses that make getting to a decent level of power a bit harder. First, businesses aren’t as plentiful as people (although by some definitions they may be people, but that’s a topic for another blog). While this might not be a huge problem for a survey (it just makes it more expensive), it is for an intervention – there is usually only so much area that the program can cover. This will limit the sample size.

Second, uptake is far from universal. Chris and David calculate an average take-up rate of 65% across the studies they look at. So my answer would be to capture intent to participate in the baseline or, better still, the census. But alas, they cite two studies which focus on samples who expressed interest in attending the training course and these clock in at 39 and 51 percent participation. This low take-up further reduces power.
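To put rough numbers on how take-up eats into power, here is a minimal sketch. It assumes a simple two-arm comparison of means with equal variances, and it assumes the training has no effect on firms that do not attend, so the intent-to-treat effect is the effect on participants scaled by the take-up rate. The function names and the 0.2 standard deviation effect size are hypothetical, chosen only for illustration; the take-up rates are the ones quoted above.

```python
from statistics import NormalDist


def n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate firms needed per arm to detect a difference in means.

    Standard normal-approximation formula for a two-sided test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return 2 * ((z_alpha + z_beta) * sd / effect) ** 2


def n_per_arm_with_takeup(effect_on_participants, sd, takeup, **kwargs):
    """Sample size when only a share `takeup` of the treatment group attends.

    Assumption: non-participants are unaffected, so the intent-to-treat
    effect equals take-up times the effect on participants.
    """
    return n_per_arm(effect_on_participants * takeup, sd, **kwargs)


if __name__ == "__main__":
    effect, sd = 0.2, 1.0  # hypothetical effect size, in standard deviations
    base = n_per_arm(effect, sd)
    for takeup in (1.0, 0.65, 0.51, 0.39):
        needed = n_per_arm_with_takeup(effect, sd, takeup)
        print(f"take-up {takeup:.0%}: {needed:,.0f} firms per arm "
              f"({needed / base:.1f}x full take-up)")
```

Under these assumptions the required sample grows with the inverse of take-up squared, so a 65% take-up rate more than doubles the number of firms needed.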

In the end, given all of these constraints on power, one of the better options might be to do more follow up surveys. For a discussion of this, see David’s previous post on this issue.
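One way to see why extra survey rounds can help: if you average each firm's outcome over several follow-up rounds, the noise in that average shrinks, and it shrinks more when the outcome is weakly correlated over time. The sketch below assumes every pair of rounds has the same correlation `rho` and the same variance, which is a simplification. The function name and the `rho` values are hypothetical, for illustration only.

```python
def variance_factor(rounds, rho):
    """Variance of a firm's average outcome over `rounds` surveys,
    relative to the variance from a single survey.

    Assumption: equal variance in every round and the same
    correlation `rho` between any two rounds.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return (1 + (rounds - 1) * rho) / rounds


if __name__ == "__main__":
    for rho in (0.2, 0.5, 0.8):  # hypothetical autocorrelations
        factors = ", ".join(
            f"{r} rounds: {variance_factor(r, rho):.2f}" for r in (1, 2, 4)
        )
        print(f"rho = {rho}: {factors}")
```

Under these assumptions, an outcome that bounces around a lot from round to round (low `rho`) gains much more from additional rounds than one that is highly persistent.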

Another issue for these types of evaluations is their short follow-up period. Almost all of the studies David and Chris look at use, at most, a one-year follow-up (they only have three studies which go further, looking at impacts after 2 years). Unfortunately, how long it takes for training to take hold, and for entrepreneurs to figure out how to use the lessons to maximum effect, is an open question.

A third big issue is attrition. David and Chris cite attrition that ranges from a comfortably low 5.3% to 24, 26, and 28%. Attrition seems to be particularly high among studies in Latin America. Chris and David’s explanation: “our personal experience has been that in many Latin American countries firms are less willing to answer surveys than in Africa or South Asia, and transitions between wage work and self employment are frequent in the region.”

Even when entrepreneurs respond to the survey, they may be loath to answer questions on profits because they fear theft (similar to asking about savings at home with households) or the taxman. David and Chris cite one study where the overall attrition rate was only 13 percent, but 46 percent of the sample did not answer the question on revenues.

Somewhat related to attrition is the issue of selective survival and start-up. This is akin to the issue faced when child mortality is reduced – the training might cause businesses to survive which would have failed had they been in the control group – which could then lead to estimates of profits which understate the impact of training. One of the options David and Chris discuss to deal with this is to do some kind of bounds analysis.
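As an illustration of what a bounds analysis can look like, here is a minimal sketch of trimming bounds in the style of Lee (2009). This is one common approach, picked here for illustration, and not necessarily the one David and Chris have in mind. The idea: if more firms survive in one group, trim the "excess" survivors from that group's profit distribution, first from the top and then from the bottom, to get a lower and an upper bound on the effect among firms that would survive either way. The function name and the numbers are hypothetical, and the sketch relies on the usual assumption that training only ever pushes survival in one direction.

```python
def trimming_bounds(treated, control, n_treated, n_control):
    """Lee-style trimming bounds on the difference in mean outcomes.

    treated, control: outcomes (e.g. profits) observed for surviving firms.
    n_treated, n_control: firms originally assigned to each group.
    Returns (lower, upper) bounds on treated-minus-control.
    """
    if not treated or not control:
        raise ValueError("need surviving firms in both groups")

    survival_t = len(treated) / n_treated
    survival_c = len(control) / n_control

    if survival_t < survival_c:
        # Control survives more often: trim the control group instead,
        # then flip the sign and order of the bounds.
        low, high = trimming_bounds(control, treated, n_control, n_treated)
        return -high, -low

    # Share of surviving treated firms that are "excess" survivors.
    trim_share = (survival_t - survival_c) / survival_t
    k = int(round(trim_share * len(treated)))
    if k >= len(treated):
        raise ValueError("nothing left after trimming")

    def mean(values):
        return sum(values) / len(values)

    ordered = sorted(treated)
    control_mean = mean(control)
    lower = mean(ordered[:len(ordered) - k]) - control_mean  # drop the top k
    upper = mean(ordered[k:]) - control_mean                 # drop the bottom k
    return lower, upper


if __name__ == "__main__":
    # Hypothetical data: 10 firms per arm, 8 treated and 6 control survive.
    treated_profits = [10, 20, 30, 40, 50, 60, 70, 80]
    control_profits = [10, 20, 30, 40, 50, 60]
    print(trimming_bounds(treated_profits, control_profits, 10, 10))
```

If the firms kept alive by training are mostly low-profit ones, the true effect on the always-surviving firms would sit closer to the upper bound, which is the bound that trims from the bottom.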

A fifth issue is that the training may alter the responses on certain questions. For example, if the training is on record keeping, then the treatment group is possibly going to give you a different quality of answers from the control group, even if the training had no impact. Few studies have dealt with this issue, but among the ones that do, Chris and David find mixed evidence that this changes responses. But still, this is something to think about and maybe even try to capture so we can get some further insight into this issue.

Finally (and this is an issue for training programs more generally), these programs have more heterogeneous implementation than other programs for business support. Think of a credit or matching-grant program. Folks get X dollars. But here, the central intervention is training and it may be implemented by a range of training providers with different skills and emphases. The more entrepreneurs the program covers (and thus the greater the sample), the more likely this is the case.

So these are some things to think about as you head off to do a business training evaluation. At the end of the day, more frequent surveys over a longer period will help. In addition, it is important to pre-test your questionnaire extensively to deal with measurement problems, perhaps even complementing the questions with direct observation. And it wouldn’t hurt to try and track the evolution (or devolution) of benefits over time.

Authors

Markus Goldstein

Lead Economist, Africa Gender Innovation Lab and Chief Economists Office
