**** Program which gives an example to show that even if you cluster standard errors,
**** your power is lower when intra-class correlation is higher
**** Keeping overall error variance constant

* Set working directory
cd C:\randomization\

*** Generate file to save values in
clear
foreach var in iteration coeff1 se1 coeff2 se2 coeff3 se3 sd1 sd2 sd3 {
    g `var'=.
}
save stats_cluster, replace
clear
set seed 124

* -------------------------------------------- *
* Taking 1000 random draws                     *
* -------------------------------------------- *
local maxiter = 1000
local iter = 1
while `iter' <= `maxiter' {
    clear
    set obs 1000
    gen iteration = `iter'

    * Generate clusters
    gen u=invnormal(uniform())
    egen clusters = cut(u), g(50)

    * Set treatment effect
    gen treat=1

    *** Assign clusters to treatment and control
    *** 50 clusters of 20 individuals each - assign 25 of them to control
    *** they are numbered 0 through 49
    gen treated=clusters<=24

    *** Generate error terms for group level
    *** (one draw per cluster, copied to every member of that cluster)
    egen rank1=rank(u), by(clusters)
    gen randomG = rnormal(0,1) if rank1==1
    egen groupvar=max(randomG), by(clusters)

    *** Generate individual error term
    gen randomI = rnormal(0,1)

    *** Generate y's varying intra-cluster correlation, keeping total variance constant
    *** (the squared weights on groupvar and randomI always sum to 1)
    ** intra-cluster correlation of 0
    gen y1 = 5 + 0.5*treated+0*groupvar + 1*randomI
    egen sd1=sd(y1)
    ** intra-cluster correlation of 0.5 (0.707107 = sqrt(0.5))
    gen y2 = 5 + 0.5*treated+0.707107*groupvar+0.707107*randomI
    egen sd2=sd(y2)
    ** intra-cluster correlation of 1
    gen y3 = 5 + 0.5*treated +1*groupvar + 0*randomI
    egen sd3=sd(y3)

    * Regressions
    reg y1 treated, r cluster(clusters)
    gen coeff1 = _b[treated]
    gen se1 = _se[treated]
    reg y2 treated, r cluster(clusters)
    gen coeff2 = _b[treated]
    gen se2 = _se[treated]
    reg y3 treated, r cluster(clusters)
    gen coeff3 = _b[treated]
    gen se3 = _se[treated]

    * Keep one row of results per iteration and append it to the results file
    keep iteration coeff1-se3 sd1 sd2 sd3
    keep in 1
    append using stats_cluster
    sort iteration
    save stats_cluster, replace
    local iter = `iter' + 1
}

* Compare the distributions of the estimated coefficients
twoway (histogram coeff1) (histogram coeff2)
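* Compare the distributions of the clustered standard errors
twoway (histogram se1) (histogram se2) (histogram se3)

*** Now calculate t-statistics
gen t1=coeff1/se1
gen t2=coeff2/se2
gen t3=coeff3/se3

* Kernel density of each t-statistic (grid points in atx*, densities in aty*)
kdensity t1, gen(atx1 aty1)
kdensity t2, gen(atx2 aty2)
kdensity t3, gen(atx3 aty3)
sort atx1

* Plot each density against its own grid; the original command listed no x variable
twoway (line aty1 atx1) (line aty2 atx2) (line aty3 atx3)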
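
*** Illustrative addition, not part of the original program: a minimal sketch
*** that turns the simulated t-statistics into empirical power, i.e. the share
*** of draws in which the true effect of 0.5 is detected at the 5% level.
*** Assumptions: t1-t3 exist as generated above; the critical value uses a
*** t distribution with 49 degrees of freedom (50 clusters minus 1), which is
*** what Stata uses after cluster(); reject1-reject3 are hypothetical names.
local crit = invttail(49, 0.025)
forvalues k = 1/3 {
    * 1 if the null of no effect is rejected in this draw, 0 otherwise
    gen reject`k' = abs(t`k') > `crit' if !missing(t`k')
}
* The mean of each reject variable is the rejection rate for that
* intra-cluster correlation (0, 0.5 and 1 respectively)
summarize reject1 reject2 reject3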