My last post discussed an example of a system intervention (improvements to the pharmaceutical supply chain) and the not uncommon inferential challenge of low power from relatively few units of observation. I planned to follow up in the next post with an overview of select randomization methods that can (but will not necessarily) boost power in these cases. Several astute commenters on the previous post beat me to the punch line – a highly encouraging omen for this new blog.
Two tried-and-true methods in the applied researcher's toolkit, useful both for improving post-randomization balance on important covariates and for improving power, are:
Blocking/Stratification, where study units are randomized within strata of baseline observable characteristics, ideally including baseline outcomes of interest. Of course, stratification is applicable only over a relatively small number of discrete characteristics due to the “curse” of dimensionality – 4 binary block variables yield 16 strata, 5 variables yield 32 strata, and so on. In a small study there may soon be more desired strata than units of observation.
Pair-wise Matching, where pairs of study units are matched using a minimum-distance criterion that can encompass numerous covariates. Within-pair random assignment to treatment or control follows the matching process. This approach doesn’t necessarily guarantee the balance of any particular covariate, but can significantly improve power. Imai, King, and Nall have a rigorous overview of this approach as well as an application to an evaluation of health insurance in Mexico.
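For readers who like to see the mechanics, here is a minimal Python sketch of both assignment schemes. It is only an illustration under stated assumptions, not the procedure used in any of the papers mentioned here: the function names and data layout are hypothetical, the matching step is a simple greedy nearest-neighbour rule (not an optimal matching algorithm), and the distance is plain Euclidean distance on covariates assumed to be already on comparable scales.

```python
import math
import random
from collections import defaultdict


def block_randomize(units, strata_keys, seed=0):
    """Randomize to treatment (1) or control (0) within strata.

    units: dict mapping unit id -> dict of discrete baseline covariates.
    strata_keys: names of the covariates that define the strata.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit_id, covs in units.items():
        strata[tuple(covs[k] for k in strata_keys)].append(unit_id)

    assignment = {}
    for key in sorted(strata):
        members = sorted(strata[key])
        rng.shuffle(members)
        n_treated = len(members) // 2
        # In an odd-sized stratum, a coin flip decides the leftover unit's arm.
        if len(members) % 2 == 1 and rng.random() < 0.5:
            n_treated += 1
        for i, unit_id in enumerate(members):
            assignment[unit_id] = 1 if i < n_treated else 0
    return assignment


def pair_match_randomize(covariates, seed=0):
    """Greedy pair-wise matching, then a coin flip within each pair.

    covariates: dict mapping unit id -> tuple of numeric baseline covariates
    (assumed pre-scaled). With an odd number of units, one is left unmatched.
    """
    rng = random.Random(seed)
    ids = sorted(covariates)
    # Every candidate pair, closest first.
    candidates = sorted(
        (math.dist(covariates[a], covariates[b]), a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    )
    matched, pairs, assignment = set(), [], {}
    for _, a, b in candidates:
        if a in matched or b in matched:
            continue
        matched.update((a, b))
        pairs.append((a, b))
        treated = rng.choice((a, b))
        assignment[a] = int(a == treated)
        assignment[b] = int(b == treated)
    return pairs, assignment
```

With two units per stratum, `block_randomize` treats exactly one of them; `pair_match_randomize` always treats exactly one unit in each matched pair. Note how the blocking function needs discrete covariates, while the matching function can take any number of continuous ones, which is the trade-off described above.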
I’d like to thank one of last week’s commenters for pointing me to Jake Bowers’ review chapter from the political science literature. This paper reviews the aforementioned methods from the perspective of a particularly low-powered study. The paper also includes a nice intuitive discussion of the pitfalls of ex-post covariance adjustment.
In a recent hallway chat with my colleague (and fellow blogger) David McKenzie, he reminded me of his relevant paper with Miriam Bruhn. I had actually read an early version of this paper, and was happy I returned to this updated published version. David and Miriam quantify the gains in power from blocking and pair-wise matching in a variety of development-relevant empirical settings through Monte Carlo analysis. They focus on situations where the study is relatively small (30 observations) and medium-sized (300 observations). The average gain in power in these stylized settings for the small-sample study is 14% for stratification (on 2 variables) and 29% for pair-wise matching, with generally greater gains in power for more persistent outcomes. Clearly in a small-N world, one of these methods may be not only helpful but critical. (For the medium-sized study simulations, the gains in power from these two methods diminish dramatically when compared with a simple random draw.)
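To get a feel for why persistence matters, here is a toy Monte Carlo sketch in Python. It is emphatically not David and Miriam's code or design: the data-generating process (a standard-normal baseline and a follow-up outcome with correlation `rho` to it), the constant additive treatment effect, and the function name `simulate_power` are all my own illustrative assumptions. It compares a simple random draw, analysed with a two-sample t-test, against pair-wise matching on the baseline outcome, analysed with a paired t-test.

```python
import random
import statistics

N = 30             # units, the "small" study size discussed above
T_CRIT_28 = 2.048  # two-sided 5% critical value, t distribution, N - 2 = 28 df
T_CRIT_14 = 2.145  # two-sided 5% critical value, t distribution, N/2 - 1 = 14 df


def simulate_power(rho, effect, n_sims=1000, seed=1):
    """Share of simulated studies that reject 'no effect' at the 5% level."""
    rng = random.Random(seed)
    rejections = {"simple": 0, "matched": 0}
    for _ in range(n_sims):
        baseline = [rng.gauss(0, 1) for _ in range(N)]
        # Untreated follow-up outcome: unit variance, correlation rho with baseline.
        y0 = [rho * b + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for b in baseline]

        # Simple random draw: half the units treated, two-sample t-test.
        idx = list(range(N))
        rng.shuffle(idx)
        treated = [y0[i] + effect for i in idx[: N // 2]]
        control = [y0[i] for i in idx[N // 2:]]
        se = ((statistics.variance(treated) + statistics.variance(control))
              / (N // 2)) ** 0.5
        if abs(statistics.mean(treated) - statistics.mean(control)) / se > T_CRIT_28:
            rejections["simple"] += 1

        # Pair-wise matching: sort on baseline, pair neighbours, flip a coin
        # within each pair, then run a paired t-test on within-pair differences.
        order = sorted(range(N), key=lambda i: baseline[i])
        diffs = []
        for a, b in zip(order[0::2], order[1::2]):
            t, c = (a, b) if rng.random() < 0.5 else (b, a)
            diffs.append(y0[t] + effect - y0[c])
        se = statistics.stdev(diffs) / len(diffs) ** 0.5
        if abs(statistics.mean(diffs)) / se > T_CRIT_14:
            rejections["matched"] += 1
    return {k: v / n_sims for k, v in rejections.items()}
```

Calling, say, `simulate_power(rho=0.9, effect=0.5)` and then `simulate_power(rho=0.2, effect=0.5)` lets you compare a highly persistent outcome with a weakly persistent one. The hard-coded critical values are specific to N = 30 and would need changing for other sample sizes.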
In my supply chain study, we actually did block districts before treatment assignment. However, at that point in the study process, we had relatively limited information on which to stratify and unfortunately we did not yet have any information on the outcome of interest at the clinic level – drug stock-out rates. So we blocked study districts by geographic variables such as province and rural or peri-urban designation. These factors accounted for only 4% of baseline variation in drug stock-outs, and thus the anticipated gains in power were small.
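A rough back-of-the-envelope calculation shows why 4% is so little. The sketch below rests on a textbook approximation that I am assuming here, not something taken from the study: if the blocking variables explain a share R² of the variance in the follow-up outcome, the variance of the estimated treatment effect shrinks by roughly a factor of (1 - R²). It further assumes that the 4% share of baseline variation carries over to the follow-up outcome. The function name is hypothetical.

```python
def se_ratio_from_blocking(r_squared):
    """Approximate ratio of the blocked to the unblocked standard error.

    Assumes the estimator's variance scales with (1 - R^2), where R^2 is the
    share of outcome variance explained by the blocking variables.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return (1.0 - r_squared) ** 0.5


# Blocking variables explaining 4% of outcome variation (assumed to carry over
# from baseline to follow-up) leave the standard error almost unchanged.
ratio = se_ratio_from_blocking(0.04)
print(f"standard error shrinks by about {100 * (1 - ratio):.1f}%")
```

Under these assumptions the standard error, and with it the minimum detectable effect, falls by only about 2%.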
This discussion highlights the additional importance of comprehensive baseline data, especially of baseline outcomes of interest, in small sample studies. Power can be significantly boosted if this data is available pre-assignment and utilized in the randomization process.