Back-of-the-envelope power calcs


This is job market season but — oh no! — I have this grant application due today and no data or time to run the power calcs! What am I going to do?!? More generally, it’s often useful to have a grasp of what is or is not likely to be powered when discussing potential research designs with implementing partners, even without pulling out a pad and paper. If you’ve ever been in this position, this may be a calming read for you.

 

At this point, we’re all very used to the formula for the minimum detectable effect (at 80% power and a 5% two-sided significance level):

\[ \text{MDE}_{0.8} = 2.8 \sigma \sqrt{\frac{1}{N_{C}} + \frac{1}{N_{T}}} \]

where \( \sigma \) is the standard deviation of the residual error of the outcome, \( N_{C} \) is the number of observations in the control group, and \( N_{T} \) is the number of observations in the treatment group.
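For readers who prefer code to algebra, here is a minimal Python sketch of the formula. The function name `mde` and its arguments are ours, not part of any library, and we are assuming the usual reading of the 2.8 multiplier as roughly 1.96 + 0.84 (a 5% two-sided test with 80% power).

```python
import math

def mde(sigma, n_control, n_treatment, multiplier=2.8):
    """Minimum detectable effect for a simple two-arm comparison.

    multiplier=2.8 is assumed to be 1.96 + 0.84 (5% two-sided test, 80% power).
    """
    if n_control <= 0 or n_treatment <= 0:
        raise ValueError("both arms need a positive number of observations")
    return multiplier * sigma * math.sqrt(1 / n_control + 1 / n_treatment)

# Example: sigma = 0.5 with 200 observations in each arm
print(round(mde(0.5, 200, 200), 3))  # 0.14
```

Unequal arms are handled by passing different values for `n_control` and `n_treatment`.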

 

To simplify things, we make the happy observation that many of the outcomes we care about are binary (poverty, graduation, program participation, …). Good news: if you know the mean of a binary outcome, you know its variance. The formula (for a binary random variable \( Y \)) goes: 

\[ \text{Var}[Y] = \text{Pr}[Y = 1] (1 - \text{Pr}[Y = 1]) \]

\[ \text{Pr}[Y = 1] = \mathbf{E}[Y] \]

 

Plotting things allows us to quickly come up with likely values of the standard deviation of our outcome of interest. 

[Figure: Mean to standard deviation for a binary outcome]
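If you’d rather see numbers than a plot, the relationship between the mean and the standard deviation of a binary outcome can be tabulated in a few lines of Python. This is only a sketch; the helper name `binary_sd` is ours.

```python
import math

def binary_sd(p):
    """Standard deviation of a binary variable with mean p: sqrt(p * (1 - p))."""
    if not 0 <= p <= 1:
        raise ValueError("the mean of a binary variable must lie in [0, 1]")
    return math.sqrt(p * (1 - p))

# Tabulate the standard deviation over a range of means
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"mean={p:.1f}  sd={binary_sd(p):.3f}")
```

The standard deviation peaks at 0.5 when the mean is 0.5 and stays above roughly 0.46 for any mean between 0.3 and 0.7.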

 

Based on this, we can keep things really simple and yet conservative by assuming that binary variables with mean between 0.3 and 0.7 have a standard deviation of approximately 0.5 (the true value lies between about 0.46 and 0.5, so 0.5 slightly overstates it). This quickly simplifies the formula to:

\[ \text{MDE}_{0.8} \approx 1.4 \sqrt{\frac{1}{N_{C}} + \frac{1}{N_{T}}} \]

 

Number of observations. The most common application is the comparison between two equally sized arms. Writing \( N \) for the total number of observations, so that \( N_{C} = N_{T} = N/2 \), the formula simplifies to:

\[ \text{MDE}_{0.8} \approx 2.8 / \sqrt{N} \]
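
To spell out the step, taking \( N \) to be the total sample size across both arms (our reading of the notation), each arm has \( N/2 \) observations, and the expression with the 1.4 multiplier becomes:

\[ 1.4 \sqrt{\frac{1}{N/2} + \frac{1}{N/2}} = 1.4 \sqrt{\frac{4}{N}} = \frac{2.8}{\sqrt{N}} \]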

 

The minimum detectable effect for a binary outcome (with mean between 0.3 and 0.7) with 100 observations evenly split between treatment and control is about 0.28, or 28 percentage points; for 400 observations it’s about 0.14, and for 1000 observations it’s about 0.09. Correct for incomplete take-up (divide the MDE by program compliance), or perhaps add a clustering adjustment (10 observations per cluster with an intracluster correlation of 0.1 increases your MDE by roughly 40%). And don’t forget to think carefully about what effect you’d need to detect to learn something meaningful from the intervention. Done.
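
The numbers above are easy to reproduce. The Python sketch below uses function names of our own choosing, and it assumes the clustering adjustment is the square root of the standard design effect, 1 + (m - 1) × ICC, with m observations per cluster; the post does not state the formula, but this one matches the 40% figure.

```python
import math

def mde_binary(n_total):
    """Rule-of-thumb MDE for a binary outcome (sd ~ 0.5), two equal arms."""
    return 2.8 / math.sqrt(n_total)

def adjust_for_take_up(mde, compliance):
    """Scale the MDE up for incomplete take-up (compliance in (0, 1])."""
    if not 0 < compliance <= 1:
        raise ValueError("compliance must be in (0, 1]")
    return mde / compliance

def clustering_multiplier(cluster_size, icc):
    """Square root of the design effect 1 + (m - 1) * icc (assumed formula)."""
    return math.sqrt(1 + (cluster_size - 1) * icc)

for n in (100, 400, 1000):
    print(n, round(mde_binary(n), 2))            # 0.28, 0.14, 0.09

print(round(clustering_multiplier(10, 0.1), 2))  # 1.38
```

With 10 observations per cluster and an intracluster correlation of 0.1, the multiplier is about 1.38, in line with the 40% increase mentioned above. The adjustments stack: `adjust_for_take_up(mde_binary(400) * clustering_multiplier(10, 0.1), 0.5)` would give the MDE for a clustered design with 50% compliance.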

 

Some other common rules of thumb. These simplifications can be applied to other common variables of interest. Clearly, any standardized variable has a standard deviation of 1 (e.g., test scores). Many always-positive variables (e.g., wages, income, plot size, …) have a mean approximately equal to the standard deviation; after normalizing by the mean, e.g., to calculate a percentage impact, the standard deviation will then be 1. And the standard deviation of log(stuff) is commonly between 0.5 and 1 (e.g., log wages, log income, log plot size, …). To apply the above MDE formula to these cases, all you need to do is double it, since the standard deviation is 1 rather than 0.5 (for log outcomes, doubling is the conservative choice).
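
As a rough illustration of the doubling, here is a Python sketch that keeps the standard deviation as an explicit argument. The function name `mde_equal_arms` is ours, and the sample size of 400 is just an example.

```python
import math

def mde_equal_arms(n_total, sigma):
    """MDE with two equal arms: 2.8 * sigma * 2 / sqrt(N)."""
    return 2.8 * sigma * 2 / math.sqrt(n_total)

# Binary outcome with mean between 0.3 and 0.7 (sd ~ 0.5)
print(round(mde_equal_arms(400, 0.5), 2))  # 0.14

# Standardized or mean-normalized outcome (sd = 1): twice as large
print(round(mde_equal_arms(400, 1.0), 2))  # 0.28
```

For a standardized test score the second number reads as 0.28 standard deviations; for an outcome normalized by its mean it reads as a 28% impact.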

Join the Conversation

Juliane
January 30, 2020

Thanks a lot for this -- super helpful!

February 04, 2020

Thank you !!!

Dan Stein
January 30, 2020

Oh this article is so good. But isn't it cheating to give away for free all this intuition that we've been painstakingly developing for years? Just kidding. I'm going to circulate this around IDinsight, and hopefully it will nudge people to stop spending so long on power calcs!

February 04, 2020

means a lot coming from you, Dan! I still remember your spreadsheet :)

Kris
January 30, 2020

Hi! Thank you for this blog post. Is it advisable to do this during the analysis stage (i.e. after data has been collected)? One of our reviewers commented: How much of the lack of statistical significance on xxxx is driven by lack of statistical power? The authors should provide a back-to-the envelope assessment of what is the power of the tests given the sample size (a classic reference to look at would be Andrews, Donald W. K. 1989. “Power in Econometric Applications.” Econometrica 57(5):1059–1090)

January 30, 2020

See our post here on ex-post power calculations:
https://blogs.worldbank.org/impactevaluations/why-ex-post-power-using-e…

Kris
February 07, 2020

Thank you. Very helpful article! Would you be familiar with the inverse power function approach proposed by Donald Andrews in the 1989 Econometrica article? Does this suffer from the same shortcomings discussed in your other blog article?