Randomization inference has been increasingly recommended as a way of analyzing data from randomized experiments, especially in samples with a small number of observations, with clustered randomization, or with high leverage (see for example Alwyn Young’s paper, and the books by Imbens and Rubin, and Gerber and Green). However, one of the barriers to widespread usage in development economics has been that, to date, no simple commands for implementing this in Stata have been available, requiring authors to program from scratch.
This has now changed with a new command, ritest, written by Simon Hess, a PhD student whom I met just over a week ago at Goethe University in Frankfurt. The command is extremely simple to use, so I thought I would introduce it and share some tips after playing around with it a little. The Stata Journal article is also now out.
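For readers who have not seen randomization inference before, here is a minimal sketch of the idea in Python. It is not ritest and makes no claim about how ritest is implemented: the function names (diff_in_means, ri_pvalue), the data, and the choice of a simple difference in means as the test statistic are all assumptions made for illustration. The logic is to hold outcomes fixed, re-draw the treatment assignment many times, and ask how often the re-drawn assignments give an estimate at least as extreme as the one actually observed.

```python
import random
from statistics import mean


def diff_in_means(outcomes, treated):
    """Mean outcome of treated units minus mean outcome of control units."""
    t = [y for y, d in zip(outcomes, treated) if d == 1]
    c = [y for y, d in zip(outcomes, treated) if d == 0]
    return mean(t) - mean(c)


def ri_pvalue(outcomes, treated, reps=2000, seed=12345):
    """Two-sided randomization-inference p-value for the difference in means.

    Assumes complete randomization with a fixed number of treated units,
    so re-drawing the assignment amounts to shuffling the treatment vector.
    """
    rng = random.Random(seed)
    observed = diff_in_means(outcomes, treated)
    shuffled = list(treated)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(shuffled)  # one hypothetical re-randomization
        if abs(diff_in_means(outcomes, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / reps


# Made-up data: 8 units, 4 treated (illustration only).
y = [3.1, 2.4, 4.0, 3.7, 1.9, 2.2, 2.8, 1.5]
d = [1, 1, 1, 1, 0, 0, 0, 0]
effect, p = ri_pvalue(y, d)
print(effect, p)
```

The re-draw step has to mimic the way treatment was actually assigned. Shuffling the whole treatment vector is only appropriate for simple, unstratified, individual-level randomization.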
How do I get this command?
Simply type findit ritest in Stata.
[Edit: that will get the version from the Stata Journal. However, to get the most recent version, which includes a couple of bug fixes noted below, type:
net describe ritest, from(https://raw.githubusercontent.com/simonheb/ritest/master/)]
- Marc Bellemare on the subject of my dissertation work – using repeated cross-sections
- From Next Billion, a summary of research showing how saving leads people to generate more income by working harder
- In the Guardian, how the World Bank is nudging health and hygiene in several projects… and a response to the concern that this distracts from more structural issues: “Why not make all programmes as effective as possible, even if it doesn’t turn a very poor country into a Scandinavian country overnight”.
- Also from the Guardian, 10 sources of data for international development research
- randtreat – a new Stata command to do random assignment that can deal with uneven numbers of observations (more details here) – this builds on an old blog post I did on the issue, and it is great to see some of these practical issues being made easier for everyone.
- synth_runner – the IDB’s Development that Works blog has a post about a new Stata command to help automate use of the synthetic control method.
A common question of interest in evaluations is “which groups does the treatment work for best?” A standard way to address this is to look at heterogeneity in treatment effects with respect to baseline characteristics. However, there are often many such possible baseline characteristics to look at, and really the heterogeneity of interest may be with respect to outcomes in the absence of treatment. Consider two examples:
A: A vocational training program for the unemployed: we might want to know whether the treatment does more for those who were likely to stay unemployed in the absence of an intervention than for those who would have been likely to find a job anyway.
B: Smaller class sizes: we might want to know whether the treatment does more for students whose test scores would have been low in the absence of smaller classes than for students who were likely to get high test scores anyway.
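One possible way to make "outcomes in the absence of treatment" operational is sketched below in Python. This is an assumption about how such an analysis could be set up, not a method taken from the text above: it predicts each unit's untreated outcome from a single baseline covariate using only control-group data, splits the sample at the median prediction, and compares treatment effects across the two groups. The function names and data are made up. Control units are predicted leave-one-out, because predicting a control unit's outcome from a model fitted on that same unit can mechanically distort the subgroup comparison.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least squares with one regressor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def predicted_untreated(x, y, d):
    """Predict each unit's outcome without treatment from control-group data.

    A control unit is dropped from the fit used for its own prediction
    (leave-one-out); treated units are predicted from all controls.
    """
    controls = [i for i in range(len(y)) if d[i] == 0]
    preds = []
    for i in range(len(y)):
        use = [j for j in controls if j != i]
        a, b = fit_line([x[j] for j in use], [y[j] for j in use])
        preds.append(a + b * x[i])
    return preds


def effects_by_predicted_group(x, y, d):
    """Difference in means within the low and high predicted-outcome groups."""
    preds = predicted_untreated(x, y, d)
    cut = median(preds)
    groups = {"low": lambda p: p <= cut, "high": lambda p: p > cut}
    out = {}
    for label, keep in groups.items():
        t = [y[i] for i in range(len(y)) if keep(preds[i]) and d[i] == 1]
        c = [y[i] for i in range(len(y)) if keep(preds[i]) and d[i] == 0]
        out[label] = mean(t) - mean(c)
    return out


# Made-up data: baseline score x, 8 control units followed by 8 treated units.
x = [1, 2, 3, 4, 5, 6, 7, 8] * 2
d = [0] * 8 + [1] * 8
y = [1.0, 2.1, 2.9, 4.2, 5.0, 5.8, 7.1, 8.0,   # controls
     3.0, 4.1, 4.9, 6.2, 5.5, 6.3, 7.6, 8.5]   # treated
result = effects_by_predicted_group(x, y, d)
print(result)
```

In example B, x would be something like a baseline test score, and the "low" group would be the students predicted to score poorly without smaller classes.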
In clustered randomized experiments, random assignment occurs at the group level, with multiple units observed within each group. For example, education interventions might be assigned at the school level, with outcomes measured at the student level, or microfinance interventions might be assigned at the savings group level, with outcomes measured for individual clients.
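To make the structure concrete, here is a small Python sketch of assignment at the group level. The function name assign_by_cluster, the school labels, and the 50 percent treated share are hypothetical choices for illustration. Treatment is drawn once per cluster and then copied to every unit in that cluster, so all students in a school share the same status.

```python
import random


def assign_by_cluster(cluster_ids, share_treated=0.5, seed=2024):
    """Randomly assign treatment to whole clusters.

    cluster_ids holds one entry per unit (e.g. the school of each student).
    Returns a 0/1 treatment indicator per unit that is constant within cluster.
    """
    rng = random.Random(seed)
    clusters = sorted(set(cluster_ids))  # sorted so the seed gives a stable draw
    rng.shuffle(clusters)
    n_treated = int(len(clusters) * share_treated)  # rounds down if uneven
    treated_clusters = set(clusters[:n_treated])
    return [1 if c in treated_clusters else 0 for c in cluster_ids]


# Made-up example: 10 students in 4 schools.
schools = ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"]
treat = assign_by_cluster(schools)
print(list(zip(schools, treat)))
```

Any re-randomization used for inference should follow the same rule, re-drawing treatment for whole clusters rather than for individual units.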