This is an excellent resource and write-up. Thanks David and Jason for blogging, and thanks Simon for writing the code.

I have also been struggling with how to test equality of coefficients with two treatments, using randomization inference.

I think the p-value should be the proportion of re-randomizations in which the absolute *difference* between the two re-randomized treatment effects is larger than the absolute difference estimated under the actual assignment.

For example, take a dataset where “b1_estimated” and “b2_estimated” are the stored coefficients of the two treatment dummies, after running the main regression thousands of times, each time randomly re-assigning clusters to treatment. The code to calculate the randomization inference p-value would be:

count if abs(true_difference) < abs(b2_estimated - b1_estimated)
gen p1 = `r(N)' / obs

where “true_difference” is a constant holding the difference in treatment impacts estimated under the actual assignment, and “obs” is the number of repetitions. With this code I get a p-value very similar to the one from a simple t-test of equality of coefficients.
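To make the p1 calculation concrete, here is a minimal self-contained sketch on made-up data. It assumes one observation per cluster, three equal-sized arms coded 0/1/2 in a hypothetical variable treat, and no strata or controls; the names treat_perm, u, sims and results are mine, and the effect sizes are arbitrary.

```stata
* Hypothetical toy data: 90 clusters, arms coded 0 = control, 1 = T1, 2 = T2
clear all
set seed 124
set obs 90
gen uid = _n
gen treat = mod(_n, 3)
gen y = 0.3*(treat == 1) + 0.5*(treat == 2) + rnormal()

* Difference in treatment impacts under the actual assignment
reg y i.treat
scalar true_difference = _b[2.treat] - _b[1.treat]

* Re-randomize 1000 times, storing both coefficients from each draw
tempname sims
tempfile results
postfile `sims' b1_estimated b2_estimated using `results'
forvalues r = 1/1000 {
    quietly {
        gen double u = runiform()
        sort u
        gen byte treat_perm = mod(_n, 3)   // 30 clusters per arm, random order
        reg y i.treat_perm
        post `sims' (_b[1.treat_perm]) (_b[2.treat_perm])
        drop u treat_perm
    }
}
postclose `sims'

* Share of draws whose absolute difference exceeds the actual one
use `results', clear
count if abs(true_difference) < abs(b2_estimated - b1_estimated)
display "RI p-value for T1 = T2: " r(N) / _N
```

With strata or multi-observation clusters the re-assignment step would have to respect them, which this sketch does not attempt.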

But I am not sure whether this can be done with the ritest command. At first I thought I could take a shortcut by making one of the treatment arms the omitted category and adding a dummy for the control group (e.g. gen T0 = treat == 0).

E.g. ritest assigntreat _b[T2], reps(1000) strata(strata) cluster(uid) seed(124): reg y T0 T2 $controls, cluster(uid)

But this does not take into account the statistical uncertainty in the assignment of treatment to T1. With this method one effectively calculates:

count if abs(true_difference) < abs(b2_estimated)
gen p2 = `r(N)' / obs

But this under-estimates the p-value. In my data, p2 < p1. (Intuitively, the variance of the difference between two independent normally distributed random variables is higher than the variance of either variable on its own.)
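One possibility I have not verified: as far as I understand, ritest accepts expressions in the same way permute does, so the difference could perhaps be requested directly. The sketch below assumes assigntreat is coded 0/1/2 and enters the regression as a factor variable, so that the arm dummies are rebuilt from the re-randomized assignment on every repetition rather than being fixed in advance.

```stata
* Assumption: assigntreat takes values 0 (control), 1 (T1), 2 (T2)
* The statistic is the T2 - T1 difference from each re-randomized regression
ritest assigntreat (_b[2.assigntreat] - _b[1.assigntreat]), ///
    reps(1000) strata(strata) cluster(uid) seed(124): ///
    reg y i.assigntreat $controls, cluster(uid)
```

If this runs as intended, the reported p-value should be comparable to p1 rather than p2.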

Let me know if you think this is the correct approach.
