Published on Development Impact

Tools of the Trade: a joint test of orthogonality when testing for balance

This is a very simple (and for once short) post, but since I have been asked this question quite a few times by people who are new to doing experiments, I figured it would be worth posting. It is also useful for non-experimental comparisons of a treatment and a control group.
Most papers with an experiment have a Table 1 where they compare the characteristics of the treatment and control groups and test for balance. (See my paper with Miriam Bruhn for discussion of why this often isn’t a sensible thing to do.) OK, but let’s assume you are in a situation where you want to do this. One approach people use is just to do a series of t-tests comparing the means of the treatment and control groups variable by variable. Or they might do this with regressions of the form:
X = a + b*Treat + e
And test whether b=0.
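To make the single-variable check concrete, here is a minimal sketch in Python. It assumes a 0/1 treatment dummy and homoskedastic errors, in which case the regression t-statistic on b is the same as the classic equal-variance two-sample t-statistic. The function name `balance_t_stat` and the simulated data are illustrative assumptions, not taken from any package.

```python
import math
import random

def balance_t_stat(x, treat):
    """OLS of x on a constant and a 0/1 treatment dummy.

    Returns (b, t): b is the treatment-control difference in means,
    t is its t-statistic under homoskedastic errors.
    """
    x1 = [xi for xi, ti in zip(x, treat) if ti == 1]
    x0 = [xi for xi, ti in zip(x, treat) if ti == 0]
    n1, n0 = len(x1), len(x0)
    m1, m0 = sum(x1) / n1, sum(x0) / n0
    b = m1 - m0  # the OLS slope equals the difference in group means
    # Residual sum of squares around each group's own mean
    ssr = sum((xi - m1) ** 2 for xi in x1) + sum((xi - m0) ** 2 for xi in x0)
    s2 = ssr / (n1 + n0 - 2)
    se = math.sqrt(s2 * (1 / n1 + 1 / n0))
    return b, b / se

# Hypothetical data: 200 units, treatment assigned by coin flip
random.seed(1)
treat = [random.randint(0, 1) for _ in range(200)]
x = [random.gauss(0, 1) for _ in range(200)]
b, t = balance_t_stat(x, treat)
print(f"b = {b:.3f}, t = {t:.2f}")  # |t| above about 1.96 rejects b=0 at the 5% level
```

In a real balance table you would repeat this for each X variable in turn.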

They might do this for 20 variables, find 1 or 2 are significant at the 5% level, and then say “this is about what we expect by chance, so it seems randomization has succeeded in generating balance”. But what if we find 3 or 4 differences out of 20 to be significant? Or what if none are individually significant, but the differences are all in the same direction?
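As a rough benchmark for “what we expect by chance”, the sketch below computes the probability of seeing at least k rejections out of 20 tests at the 5% level. It assumes the 20 tests are independent, which is my simplifying assumption rather than something true of most baseline covariates (they are usually correlated), so treat the numbers as a back-of-the-envelope guide only. The helper name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n=20, alpha=0.05):
    """P(at least k of n independent tests reject at level alpha) when all nulls are true."""
    # One minus the binomial probability of fewer than k rejections
    return 1 - sum(comb(n, j) * alpha**j * (1 - alpha) ** (n - j) for j in range(k))

for k in (1, 2, 3, 4):
    print(k, round(prob_at_least(k), 3))
```

This kind of count says nothing about differences that are individually insignificant but all point the same way, which is where a joint test helps.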

An alternative, or complementary, approach is to test for joint orthogonality. To do this, take your set of X variables (X1, X2, …, X20) and run the following:
Treat = a + b1*X1 + b2*X2 + b3*X3 + … + b20*X20 + u
And then test the joint hypothesis b1=b2=b3=…=b20=0
This can be run as a linear regression, with an F-test; or as a probit, with a chi-squared test.
That’s it, very simple. I think people get confused because the treatment variable jumps from being on the right-hand side for the single variable tests to being on the left-hand side for the joint orthogonality test.
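Here is a minimal sketch in Python of the linear-regression version of the joint test, with the treatment dummy now on the left-hand side. It uses only the standard library, assumes homoskedastic errors, and computes the usual F-statistic for the hypothesis that all slopes are zero. The names `solve` and `joint_f_stat` and the simulated data are illustrative assumptions; in practice your statistical package's built-in joint test would do this for you.

```python
import random

def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def joint_f_stat(treat, X):
    """F-statistic for H0: b1 = ... = bk = 0 in Treat = a + b1*X1 + ... + bk*Xk + u.

    X is a list of rows, one row of k covariate values per unit.
    """
    n, k = len(treat), len(X[0])
    Z = [[1.0] + list(row) for row in X]  # prepend the constant
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(k + 1)] for i in range(k + 1)]
    Zty = [sum(z[i] * t for z, t in zip(Z, treat)) for i in range(k + 1)]
    beta = solve(ZtZ, Zty)  # OLS coefficients from the normal equations
    fitted = [sum(bj * zj for bj, zj in zip(beta, z)) for z in Z]
    tbar = sum(treat) / n
    ssr = sum((t - f) ** 2 for t, f in zip(treat, fitted))  # unrestricted residual SS
    sst = sum((t - tbar) ** 2 for t in treat)               # restricted (constant-only) SS
    return ((sst - ssr) / k) / (ssr / (n - k - 1))

# Hypothetical data: 200 units, 5 covariates, treatment assigned by coin flip
random.seed(2)
n, k = 200, 5
treat = [random.randint(0, 1) for _ in range(n)]
X = [[random.gauss(0, 1) for _ in range(k)] for _ in range(n)]
print(f"F({k}, {n - k - 1}) = {joint_f_stat(treat, X):.3f}")
```

To get a p-value, compare the statistic with the F(k, n-k-1) distribution, for example with `scipy.stats.f.sf(F, k, n - k - 1)` if SciPy is available. With a single covariate the F-statistic is just the square of the t-statistic from the variable-by-variable regression, which is one way to see that moving Treat to the left-hand side is testing the same thing.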
Now what if you have multiple treatment groups? You can then run a multinomial logit or your other preferred specification and test for joint orthogonality within this framework, but I’ve not seen this done very often – typically I see people just compare each treatment separately to the control.


David McKenzie

Lead Economist, Development Research Group, World Bank
