Tools of the Trade
Today, I am writing about something many of you already know. You’ve probably been hearing about it for 5-10 years. But, you still ignore it. Well, now that the evidence against it has mounted enough and the fix is simple enough, I am here to urge you to tweak your regression specifications in your program evaluations.
In ancient Greek times, important decisions were never made without consulting the high priestess at the Oracle of Delphi. She would deliver wisdom from the gods, although this advice was sometimes vague or confusing, and was often misinterpreted by mortals. Today I bring word that the high priestess and priests (Athey, Abadie, Imbens and Wooldridge) have delivered new wisdom from the god of econometrics on the important decision of when you should cluster standard errors. This is definitely one of life’s most important questions, as any keen player of seminar bingo can surely attest. In case their paper is all Greek to you (half of it literally is), I will attempt to summarize their recommendations, so that your standard errors may be heavenly.
Randomization inference has been increasingly recommended as a way of analyzing data from randomized experiments, especially in samples with a small number of observations, with clustered randomization, or with high leverage (see for example Alwyn Young’s paper, and the books by Imbens and Rubin, and Gerber and Green). However, one of the barriers to widespread usage in development economics has been that, to date, no simple commands for implementing this in Stata have been available, requiring authors to program from scratch.
This has now changed with a new command, ritest, written by Simon Hess, a PhD student whom I met just over a week ago at Goethe University in Frankfurt. This command is extremely simple to use, so I thought I would introduce it and share some tips after playing around with it a little. The Stata Journal article is also now out.
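For readers who have not seen the procedure spelled out, here is a minimal sketch of the logic behind randomization inference, written in Python rather than Stata. It is not the ritest implementation: the data, the function names, and the choice of a simple difference in means as the test statistic are all assumptions made for illustration, and it ignores strata and clusters entirely. The idea is to re-shuffle the treatment labels many times and ask how often a re-shuffled assignment produces an estimate at least as extreme as the one actually observed.

```python
import random
from statistics import mean


def diff_in_means(outcomes, treated):
    """Difference in mean outcomes between treated (1) and control (0) units."""
    t = [y for y, d in zip(outcomes, treated) if d == 1]
    c = [y for y, d in zip(outcomes, treated) if d == 0]
    return mean(t) - mean(c)


def randomization_p_value(outcomes, treated, reps=2000, seed=0):
    """Two-sided randomization-inference p-value under the sharp null of no effect.

    Treatment labels are re-shuffled `reps` times, keeping the number of treated
    units fixed, and the share of shuffles whose absolute difference in means is
    at least as large as the observed one is returned as the p-value.
    """
    rng = random.Random(seed)
    observed = diff_in_means(outcomes, treated)
    shuffled = list(treated)  # copy, so the caller's assignment is left untouched
    extreme = 0
    for _ in range(reps):
        rng.shuffle(shuffled)
        if abs(diff_in_means(outcomes, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / reps


# Hypothetical data: eight units, four of them treated.
outcomes = [12, 15, 14, 18, 9, 10, 11, 8]
treated = [1, 1, 1, 1, 0, 0, 0, 0]

observed, p_value = randomization_p_value(outcomes, treated)
print(f"observed difference: {observed}, randomization p-value: {p_value:.3f}")
```

With only eight units there are just 70 possible assignments, and exactly two of them are as extreme as the observed one, so the simulated p-value should land close to 2/70. In a real experiment the re-shuffling has to mimic the original randomization design, including any stratification or clustering.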
How do I get this command?
Simply type findit ritest in Stata.
[Edit: that will get the version from the Stata Journal. To get the most recent version, which includes a couple of bug fixes noted below, type the following instead.]
net describe ritest, from(https://raw.githubusercontent.com/simonheb/ritest/master/)
Here is a familiar scenario for those running field experiments: You’re conducting a study with a treatment and a comparison arm and measuring your main outcomes with surveys and/or biomarker data collection, meaning that you need to contact the subjects (unlike, say, using administrative data tied to their national identity numbers), preferably in person. You know that you will, inevitably, lose some subjects from both groups to follow-up: they will have moved, be temporarily away, have refused to answer, have died, etc. In some of these cases there is nothing more you can do, but in others you can try harder: you can wait for them to come back and revisit; you can try to track them to their new location, etc. You can do this at different intensities (try really hard or not so much), within different boundaries (for everyone in the study district, region, or country, but not for those farther away), and for different samples (for everyone or for a random sub-sample).
Question: suppose that you decide that you have the budget to do everything you can to find those not interviewed during the first pass through the study areas (it doesn’t matter whether the budget covers everyone or only a randomly chosen sub-sample), i.e. an intense tracking exercise to reduce the rate of attrition. In addition to everything else you can do to track subjects from both groups, you have a tool that you can use only for those in the treatment arm. Say your treatment was group-based therapy for teen mums, and you think that the mentors for these groups may have key contact information for treatment-group subjects who moved; there were no placebo groups in control, i.e. no counterpart mentors. Do you use this source to track subjects, even if it is only available for the treatment group?
About a year ago, I wrote a blog post on issues surrounding data collection and measurement. In it, I talked about “list experiments” for sensitive questions, on which I was not sold at the time. However, now that I have a number of studies going to the field at different stages of data collection, many of which are about sensitive topics in adolescent female target populations, I am paying closer attention to them. In my reading and thinking about the topic and how to implement it in our surveys, I came up with a bunch of questions surrounding the optimal implementation of these methods. In addition, there is probably more to be learned about these methods to improve them further, opening up the possibility of experimenting with them when we can. Below are the things that I am thinking about and, as we still have some time before our data collection tools are finalized, you, our readers, have a chance to help shape them with your comments and feedback.