Attrition is a bugbear for most impact evaluations, and can expose even the best-designed experiments to potential bias. In a new paper, Luc Behaghel, Bruno Crépon, Marc Gurgand and Thomas Le Barbanchon describe a clever new way to deal with this problem using information on the number of attempts it takes to get someone to respond to a survey.
The intuition is easily seen in Figure 3 from the paper below. Suppose people differ in the amount of effort it takes to get them to answer a survey, and that the treatment may affect the overall willingness of people to participate in the survey, but not their relative rank within the treatment group in terms of how easy it is to contact them. In their empirical example, only 51 percent of the control group sample answer the survey even after up to 25 phone attempts. Their idea is essentially to trim away the extra responders in the treatment group (those above the horizontal red line), so that we are comparing the 51 percent of the treatment group who are easiest to reach with the 51 percent of the control group who could be reached. If the intensity of effort to get people to respond were continuous, this would yield point identification.
However, since the number of phone calls is an integer, slightly less than 51 percent of the treatment group responds within 6 phone calls, and slightly more within 7 phone calls. The solution is to take the smallest number of calls (7) at which the treatment group's response rate is at least as high as the control group's, and then, as with standard Lee bounds, create upper and lower bounds by trimming the treatment group and making assumptions about the additional observations. For example, if the outcome is being employed, and 53% of the treatment group respond within 7 calls, then one trims (53-51)/53 = 3.8% of the treated observations. The lower (upper) bound for the employment impact is obtained when it is assumed that these marginal observations, who are observed in the treatment group but who wouldn’t have been observed had they been in the control group, are all employed (unemployed).
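The trimming step can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the authors' code: the function name `attempt_trimmed_bounds` and its arguments are hypothetical, it assumes the treatment group is the one with the higher response rate, and it ignores sampling error (no standard errors or confidence intervals).

```python
import numpy as np

def attempt_trimmed_bounds(calls_t, y_t, calls_c, y_c):
    """Illustrative bounds on the treatment effect using call attempts.

    calls_t, calls_c: attempts needed to reach each person in the treatment
                      and control groups (np.inf if never reached).
    y_t, y_c:         outcomes (e.g. 1 = employed); values for people who
                      were never reached are ignored.
    Assumes the treatment group responds at least as often as control.
    """
    calls_t = np.asarray(calls_t, dtype=float)
    calls_c = np.asarray(calls_c, dtype=float)
    y_t = np.asarray(y_t, dtype=float)
    y_c = np.asarray(y_c, dtype=float)

    n_t, n_c = calls_t.size, calls_c.size
    resp_c = np.isfinite(calls_c)
    r_c = int(resp_c.sum())  # number of control respondents

    # Smallest number of attempts at which the treatment response rate
    # reaches the control response rate (compared with integer counts).
    n_star = None
    for n in np.unique(calls_t[np.isfinite(calls_t)]):
        kept = int((calls_t <= n).sum())
        if kept * n_c >= r_c * n_t:
            n_star = float(n)
            break
    if n_star is None:
        raise ValueError("treatment never reaches the control response rate")

    keep = calls_t <= n_star
    kept = int(keep.sum())

    # Share to trim: (rate_t - rate_c) / rate_t, written with integer counts.
    excess = kept * n_c - r_c * n_t
    trim_share = excess / (kept * n_c)
    k = excess // n_c  # whole number of treated observations to trim

    y_kept = np.sort(y_t[keep])
    mean_c = y_c[resp_c].mean()

    # Lower bound: the marginal respondents are the k highest outcomes.
    lower = y_kept[: y_kept.size - k].mean() - mean_c
    # Upper bound: the marginal respondents are the k lowest outcomes.
    upper = y_kept[k:].mean() - mean_c
    return n_star, trim_share, lower, upper
```

With 100 people per arm, 51 control respondents, and 53 treatment respondents within 7 calls, the function trims 2 of the 53 treated observations, matching the (53-51)/53 calculation above.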
The authors illustrate this with data from a French employment experiment, and show that they would obtain Manski extreme bounds of [-35.6%, +49.8%] and Lee bounds of [0.3%, 21.7%], versus bounds of [10.1%, 12.7%] using their method. So their method narrows the interval from 21.4 percentage points under Lee bounds to 2.6 percentage points, a big improvement in precision.
· It is important to note that this method identifies an average treatment effect for the subsample of respondents who would respond regardless of treatment status. This may differ from the unknown treatment effect on the full sample.
· The method only offers a benefit over Lee bounds when attrition is unbalanced between treatment and control groups: if attrition is large but occurs at equal rates in the treatment and control groups, both Lee bounds and their bounds collapse to a point estimate, which is equivalent to ignoring attrition and assuming the samples that answer are similar in the treatment and control groups.
· The monotonicity assumption being made is likely to be reasonable in many settings but may be questionable in others. For example, in many interventions take-up of the treatment is less than 100%. We might think that those who take up the treatment and find it useful are more likely to agree to be re-surveyed than the control group, whereas those who refused the treatment or who took the treatment but had a bad experience with it may be even less likely to be surveyed than if they were in the control group. This would violate the identifying assumption, and this method would then not work.
· Just as we typically test for selective attrition by looking to see if the sample who answer the survey is balanced on observable characteristics, one should likewise be able to check whether the trimmed sample being used here (e.g. all of the control group respondents, and the treatment group members who answered within 7 attempts) is balanced in terms of observables. Likewise one can check whether the monotonicity assumption seems to hold for a few key variables, e.g. by checking that it is not the case that response rates are higher for treatment than for control among more educated people, but lower among less educated people, etc. Obviously this does not necessarily mean that there is balance on unobservables, but it is a start.
· Other studies have used an approach which randomly varies the intensity of effort in order to create an IV. As the authors note, that amounts to wasting part of the sample (the part on which data collection is not maximized), and so the approach in this paper may be more attractive in many cases.
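The subgroup check on monotonicity mentioned above (comparing treatment and control response rates separately for, say, more and less educated people) could be sketched as follows. The function names and the record layout are hypothetical, and the check is purely descriptive: it compares raw response rates and does not test whether differences are statistically distinguishable from zero.

```python
from collections import defaultdict

def response_gap_by_group(records):
    """Treatment-minus-control response rate within each subgroup.

    records: iterable of (group_label, treated, responded) tuples,
             where treated and responded are booleans.
    Subgroups missing either arm are skipped.
    """
    # counts[group][treated] = [number responding, number in cell]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for group, treated, responded in records:
        cell = counts[group][bool(treated)]
        cell[0] += int(bool(responded))
        cell[1] += 1

    gaps = {}
    for group, cells in counts.items():
        (r_t, n_t), (r_c, n_c) = cells[True], cells[False]
        if n_t == 0 or n_c == 0:
            continue
        gaps[group] = r_t / n_t - r_c / n_c
    return gaps

def signs_agree(gaps):
    """True if no subgroup gap is positive while another is negative."""
    has_pos = any(g > 0 for g in gaps.values())
    has_neg = any(g < 0 for g in gaps.values())
    return not (has_pos and has_neg)
```

If `signs_agree` returns False for a key variable, treatment appears to raise response in one subgroup and lower it in another, which is the kind of pattern that would cast doubt on the monotonicity assumption.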
In short, this seems like a useful tool to add to our arsenal, and is something that I will be trying in analyzing a current experiment that has struggled with attrition issues. Hopefully others will also find it of use.