I’ve been travelling for the past week, and several people contacted me with questions about impact evaluation while I was away. I figured these might come up again, so I’m putting the questions and answers up here in case they are useful for others.
Question 1: Winsorizing – “do we do this on the whole sample, or do we do it within treatment and control, baseline and follow-up?”
Winsorizing is commonly used to deal with outliers: for example, you might set all data points above the 99th percentile equal to the 99th percentile. The key here is that you don’t use different cutoffs for treatment and control. For example, suppose you have a treatment for businesses that makes 4 percent of the treatment group grow their sales massively. If you winsorize separately at the 95th percentile of the treatment distribution for the treatment group and at the 95th percentile of the control distribution for the control group, you might end up completely missing the treatment effect. I do think it makes sense to use separate cutoffs by survey round, both to allow for seasonal effects and so that you aren’t winsorizing more points from one round than another (which could happen if you used the same global cutoffs for all rounds).
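In case it helps to see the mechanics, here is a minimal Python sketch of the approach described above: one cutoff per survey round, computed on the pooled treatment and control sample, and applied identically to both arms. Everything in it is an assumption for illustration: the helper names `percentile` and `winsorize_by_round`, the field names `round`, `treat` and `sales`, and the made-up data. It only caps the upper tail, and it uses linear interpolation between order statistics, which is just one of several percentile conventions, so your software may give slightly different cutoffs.

```python
from collections import defaultdict


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def winsorize_by_round(rows, q=99):
    """Cap 'sales' at the q-th percentile of each survey round.

    The cutoff is computed on treatment and control pooled together,
    so both arms in a given round face the same cap.
    """
    by_round = defaultdict(list)
    for row in rows:
        by_round[row["round"]].append(row["sales"])
    cutoffs = {rnd: percentile(vals, q) for rnd, vals in by_round.items()}
    # Build new dicts so the original rows are left unchanged.
    capped = [
        dict(row, sales=min(row["sales"], cutoffs[row["round"]]))
        for row in rows
    ]
    return capped, cutoffs


# Hypothetical data: two rounds, with follow-up sales twice as large
# (standing in for a seasonal effect), 100 firms per arm per round.
rows = [
    {"round": rnd, "treat": treat, "sales": i * scale}
    for rnd, scale in (("baseline", 1), ("followup", 2))
    for treat in (0, 1)
    for i in range(1, 101)
]

capped, cutoffs = winsorize_by_round(rows, q=99)
print(cutoffs)
```

Because the cutoff depends only on the round, never on treatment status, any firm above the cap is treated the same way regardless of which arm it is in, while each round still gets its own threshold.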